Mastering Embedded Linux Programming
Second Edition
Unleash the full potential of Embedded Linux
Chris Simmonds
BIRMINGHAM - MUMBAI
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: December 2015
Second edition: June 2017
Production reference: 1280617
Published by Packt Publishing Ltd., Livery Place, 35 Livery Street, Birmingham B3 2PB, UK.
ISBN 978-1-78728-328-2
www.packtpub.com
Credits
Author
Chris Simmonds
Copy Editors
Madhusudan Uchil
Stuti Shrivastava
Reviewers
Daiane Angolini
Otavio Salvador
Alex Tereschenko
Project Coordinator
Virginia Dias
Commissioning Editor
Kartikey Pandey
Proofreader
Safis Editing
Acquisition Editor
Prateek Bharadwaj
Indexer
Rekha Nair
Content Development Editor
Sharon Raj
Graphics
Kirk D'Penha
Technical Editor
Vishal Kamal Mewada
Production Coordinator
Melwyn Dsa
About the Author
Chris Simmonds is a software consultant and trainer living in southern England. He has almost two decades of experience in designing and building open-source embedded systems. He is the founder and chief consultant at 2net Ltd, which provides professional training and mentoring services in embedded Linux, Linux device drivers, and Android platform development. He has trained engineers at many of the biggest companies in the embedded world, including ARM, Qualcomm, Intel, Ericsson, and General Dynamics. He is a frequent presenter at open source and embedded conferences, including the Embedded Linux Conference and Embedded World. You can see some of his work on the Inner Penguin blog at www.2net.co.uk.
I would like to thank Shirley Simmonds for being so supportive during the long hours that I was shut in my home office researching and writing this book. I would also like to thank all the people who have helped me with the research of the technical aspects of this book, whether they realized that is what they were doing or not. In particular, I would like to mention Klaas van Gend, Thomas Petazzoni, and Ralph Nguyen for their help and advice. Lastly, I would like to thank Sharon Raj, Vishal Mewada, and the team at Packt Publishing for keeping me on track and bringing the book to fruition.
About the Reviewers
Daiane Angolini has been working with embedded Linux since 2008. She has been working as an application engineer at NXP, acting on internal development, porting custom applications from Android, and on-customer support for i.MX architectures in areas such as Linux kernel, u-boot, Android, Yocto Project, and user-space applications. However, it was on the Yocto Project that she found her place. She has coauthored the books Embedded Linux Development with Yocto Project and Heading for the Yocto Project, and learned a lot in the process.
Otavio Salvador loves technology and started his free software activities in 1999. In 2002, he founded O.S. Systems, a company focused on embedded system development services and consultancy worldwide, creating and maintaining customized BSPs, and helping companies with their product's development challenges. This resulted in him joining the OpenEmbedded community in 2008, when he became an active contributor to the OpenEmbedded project. He has coauthored the books Embedded Linux Development with Yocto Project and Heading for the Yocto Project.
Alex Tereschenko is an embedded systems engineer by day, and an avid maker by night, who is convinced that computers can do a lot of good for people when they are interfaced with real-world objects, as opposed to just crunching data in a dusty corner. That's what's driving him in his projects, and this is why embedded systems and the Internet of Things are the topics he enjoys the most.
Table of Contents
Preface
    What this book covers
    What you need for this book
    Who this book is for
    Conventions
    Reader feedback
    Customer support
    Downloading the example code
    Downloading the color images of this book
    Errata
    Piracy
    Questions
1. Starting Out
    Selecting the right operating system
    The players
    Project lifecycle
    The four elements of embedded Linux
    Open source
    Licenses
    Hardware for embedded Linux
    Hardware used in this book
    The BeagleBone Black
    QEMU
    Software used in this book
    Summary
2. Learning About Toolchains
    Introducing toolchains
    Types of toolchains
    CPU architectures
    Choosing the C library
    Finding a toolchain
    Building a toolchain using crosstool-NG
    Installing crosstool-NG
    Building a toolchain for BeagleBone Black
    Building a toolchain for QEMU
    Anatomy of a toolchain
    Finding out about your cross compiler
    The sysroot, library, and header files
    Other tools in the toolchain
    Looking at the components of the C library
    Linking with libraries – static and dynamic linking
    Static libraries
    Shared libraries
    Understanding shared library version numbers
    The art of cross compiling
    Simple makefiles
    Autotools
    An example: SQLite
    Package configuration
    Problems with cross compiling
    Summary
3. All About Bootloaders
    What does a bootloader do?
    The boot sequence
    Phase 1 – ROM code
    Phase 2 – secondary program loader
    Phase 3 – TPL
    Booting with UEFI firmware
    Moving from bootloader to kernel
    Introducing device trees
    Device tree basics
    The reg property
    Labels and interrupts
    Device tree include files
    Compiling a device tree
    Choosing a bootloader
    U-Boot
    Building U-Boot
    Installing U-Boot
    Using U-Boot
    Environment variables
    Boot image format
    Loading images
    Booting Linux
    Automating the boot with U-Boot scripts
    Porting U-Boot to a new board
    Board-specific files
    Configuring header files
    Building and testing
    Falcon mode
    Barebox
    Getting barebox
    Building barebox
    Using barebox
    Summary
4. Configuring and Building the Kernel
    What does the kernel do?
    Choosing a kernel
    Kernel development cycle
    Stable and long term support releases
    Vendor support
    Licensing
    Building the kernel
    Getting the source
    Understanding kernel configuration – Kconfig
    Using LOCALVERSION to identify your kernel
    Kernel modules
    Compiling – Kbuild
    Finding out which kernel target to build
    Build artifacts
    Compiling device trees
    Compiling modules
    Cleaning kernel sources
    Building a kernel for the BeagleBone Black
    Building a kernel for QEMU
    Booting the kernel
    Booting the BeagleBone Black
    Booting QEMU
    Kernel panic
    Early user space
    Kernel messages
    Kernel command line
    Porting Linux to a new board
    A new device tree
    Setting the board compatible property
    Additional reading
    Summary
5. Building a Root Filesystem
    What should be in the root filesystem?
    The directory layout
    The staging directory
    POSIX file access permissions
    File ownership permissions in the staging directory
    Programs for the root filesystem
    The init program
    Shell
    Utilities
    BusyBox to the rescue!
    Building BusyBox
    ToyBox – an alternative to BusyBox
    Libraries for the root filesystem
    Reducing the size by stripping
    Device nodes
    The proc and sysfs filesystems
    Mounting filesystems
    Kernel modules
    Transferring the root filesystem to the target
    Creating a boot initramfs
    Standalone initramfs
    Booting the initramfs
    Booting with QEMU
    Booting the BeagleBone Black
    Mounting proc
    Building an initramfs into the kernel image
    Building an initramfs using a device table
    The old initrd format
    The init program
    Starting a daemon process
    Configuring user accounts
    Adding user accounts to the root filesystem
    A better way of managing device nodes
    An example using devtmpfs
    An example using mdev
    Are static device nodes so bad after all?
    Configuring the network
    Network components for glibc
    Creating filesystem images with device tables
    Booting the BeagleBone Black
    Mounting the root filesystem using NFS
    Testing with QEMU
    Testing with the BeagleBone Black
    Problems with file permissions
    Using TFTP to load the kernel
    Additional reading
    Summary
6. Selecting a Build System
    Build systems
    Package formats and package managers
    Buildroot
    Background
    Stable releases and long-term support
    Installing
    Configuring
    Running
    Creating a custom BSP
    U-Boot
    Linux
    Build
    Adding your own code
    Overlays
    Adding a package
    License compliance
    The Yocto Project
    Background
    Stable releases and supports
    Installing the Yocto Project
    Configuring
    Building
    Running the QEMU target
    Layers
    BitBake and recipes
    Customizing images via local.conf
    Writing an image recipe
    Creating an SDK
    The license audit
    Further reading
    Summary
7. Creating a Storage Strategy
    Storage options
    NOR flash
    NAND flash
    Managed flash
    MultiMediaCard and Secure Digital cards
    eMMC
    Other types of managed flash
    Accessing flash memory from the bootloader
    U-Boot and NOR flash
    U-Boot and NAND flash
    U-Boot and MMC, SD, and eMMC
    Accessing flash memory from Linux
    Memory technology devices
    MTD partitions
    MTD device drivers
    The MTD character device, mtd
    The MTD block device, mtdblock
    Logging kernel oops to MTD
    Simulating NAND memory
    The MMC block driver
    Filesystems for flash memory
    Flash translation layers
    Filesystems for NOR and NAND flash memory
    JFFS2
    Summary nodes
    Clean markers
    Creating a JFFS2 filesystem
    YAFFS2
    Creating a YAFFS2 filesystem
    UBI and UBIFS
    UBI
    UBIFS
    Filesystems for managed flash
    Flashbench
    Discard and TRIM
    Ext4
    F2FS
    FAT16/32
    Read-only compressed filesystems
    squashfs
    Temporary filesystems
    Making the root filesystem read-only
    Filesystem choices
    Further reading
    Summary
8. Updating Software in the Field
    What to update?
    Bootloader
    Kernel
    Root filesystem
    System applications
    Device-specific data
    Components that need to be updated
    The basics of software update
    Making updates robust
    Making updates fail-safe
    Making updates secure
    Types of update mechanism
    Symmetric image update
    Asymmetric image update
    Atomic file updates
    OTA updates
    Using Mender for local updates
    Building the Mender client
    Installing an update
    Using Mender for OTA updates
    Summary
9. Interfacing with Device Drivers
    The role of device drivers
    Character devices
    Block devices
    Network devices
    Finding out about drivers at runtime
    Getting information from sysfs
    The devices: /sys/devices
    The drivers: /sys/class
    The block drivers: /sys/block
    Finding the right device driver
    Device drivers in user space
    GPIO
    Handling interrupts from GPIO
    LEDs
    I2C
    Serial Peripheral Interface (SPI)
    Writing a kernel device driver
    Designing a character driver interface
    The anatomy of a device driver
    Compiling kernel modules
    Loading kernel modules
    Discovering the hardware configuration
    Device trees
    The platform data
    Linking hardware with device drivers
    Additional reading
    Summary
10. Starting Up – The init Program
    After the kernel has booted
    Introducing the init programs
    BusyBox init
    Buildroot init scripts
    System V init
    inittab
    The init.d scripts
    Adding a new daemon
    Starting and stopping services
    systemd
    Building systemd with the Yocto Project and Buildroot
    Introducing targets, services, and units
    Units
    Services
    Targets
    How systemd boots the system
    Adding your own service
    Adding a watchdog
    Implications for embedded Linux
    Further reading
    Summary
11. Managing Power
    Measuring power usage
    Scaling the clock frequency
    The CPUFreq driver
    Using CPUFreq
    Selecting the best idle state
    The CPUIdle driver
    Tickless operation
    Powering down peripherals
    Putting the system to sleep
    Power states
    Wakeup events
    Timed wakeups from the real-time clock
    Further reading
    Summary
12. Learning About Processes and Threads
    Process or thread?
    Processes
    Creating a new process
    Terminating a process
    Running a different program
    Daemons
    Inter-process communication
    Message-based IPC
    Unix (or local) sockets
    FIFOs and named pipes
    POSIX message queues
    Summary of message-based IPC
    Shared memory-based IPC
    POSIX shared memory
    Threads
    Creating a new thread
    Terminating a thread
    Compiling a program with threads
    Inter-thread communication
    Mutual exclusion
    Changing conditions
    Partitioning the problem
    Scheduling
    Fairness versus determinism
    Time-shared policies
    Niceness
    Real-time policies
    Choosing a policy
    Choosing a real-time priority
    Further reading
    Summary
13. Managing Memory
    Virtual memory basics
    Kernel space memory layout
    How much memory does the kernel use?
    User space memory layout
    The process memory map
    Swapping
    Swapping to compressed memory (zram)
    Mapping memory with mmap
    Using mmap to allocate private memory
    Using mmap to share memory
    Using mmap to access device memory
    How much memory does my application use?
    Per-process memory usage
    Using top and ps
    Using smem
    Other tools to consider
    Identifying memory leaks
    mtrace
    Valgrind
    Running out of memory
    Further reading
    Summary
14. Debugging with GDB
    The GNU debugger
    Preparing to debug
    Debugging applications
    Remote debugging using gdbserver
    Setting up the Yocto Project for remote debugging
    Setting up Buildroot for remote debugging
    Starting to debug
    Connecting GDB and gdbserver
    Setting the sysroot
    GDB command files
    Overview of GDB commands
    Breakpoints
    Running and stepping
    Getting information
    Running to a breakpoint
    Native debugging
    The Yocto Project
    Buildroot
    Just-in-time debugging
    Debugging forks and threads
    Core files
    Using GDB to look at core files
    GDB user interfaces
    Terminal user interface
    Data display debugger
    Eclipse
    Debugging kernel code
    Debugging kernel code with kgdb
    A sample debug session
    Debugging early code
    Debugging modules
    Debugging kernel code with kdb
    Looking at an Oops
    Preserving the Oops
    Further reading
    Summary
15. Profiling and Tracing
    The observer effect
    Symbol tables and compile flags
    Beginning to profile
    Profiling with top
    Poor man's profiler
    Introducing perf
    Configuring the kernel for perf
    Building perf with the Yocto Project
    Building perf with Buildroot
    Profiling with perf
    Call graphs
    perf annotate
    Other profilers – OProfile and gprof
    Tracing events
    Introducing Ftrace
    Preparing to use Ftrace
    Using Ftrace
    Dynamic Ftrace and trace filters
    Trace events
    Using LTTng
    LTTng and the Yocto Project
    LTTng and Buildroot
    Using LTTng for kernel tracing
    Using Valgrind
    Callgrind
    Helgrind
    Using strace
    Summary
16. Real-Time Programming
    What is real time?
    Identifying sources of non-determinism
    Understanding scheduling latency
    Kernel preemption
    The real-time Linux kernel (PREEMPT_RT)
    Threaded interrupt handlers
    Preemptible kernel locks
    Getting the PREEMPT_RT patches
    The Yocto Project and PREEMPT_RT
    High-resolution timers
    Avoiding page faults
    Interrupt shielding
    Measuring scheduling latencies
    cyclictest
    Using Ftrace
    Combining cyclictest and Ftrace
    Further reading
    Summary
Preface
Linux has been the mainstay of embedded computing for many years. And yet, there are remarkably few books that cover the topic as a whole: this book is intended to fill that gap. The term embedded Linux is not well-defined, and can be applied to the operating system inside a wide range of devices ranging from thermostats to Wi-Fi routers to industrial control units. However, they are all built on the same basic open source software. Those are the technologies that I describe in this book, based on my experience as an engineer and the materials I have developed for my training courses.
Technology does not stand still. The industry based around embedded computing is just as susceptible to Moore's law as mainstream computing. The exponential growth that this implies has meant that a surprisingly large number of things have changed since the first edition of this book was published. This second edition is fully revised to use the latest versions of the major open source components, which include Linux 4.9, Yocto Project 2.2 Morty, and Buildroot 2017.02. Since it is clear that embedded Linux will play an important part in the Internet of Things, there is a new chapter on the updating of devices in the field, including Over the Air updates. Another trend is the quest to reduce power consumption, both to extend the battery life of mobile devices and to reduce energy costs. The chapter on power management shows how this is done.
Mastering Embedded Linux Programming covers the topics in roughly the order that you will encounter them in a real-life project. The first 6 chapters are concerned with the early stages of the project, covering basics such as selecting the toolchain, the bootloader, and the kernel. At the conclusion of this section, I introduce the idea of using an embedded build tool, using Buildroot and the Yocto Project as examples.
The middle part of the book, chapters 7 through to 13, will help you in the implementation phase of the project. It covers the topics of filesystems, the init program, multithreaded programming, software update, and power management. The third section, chapters 14 and 15, shows you how to make effective use of the many debug and profiling tools that Linux has to offer in order to detect problems and identify bottlenecks. The final chapter brings together several threads to explain how Linux can be used in real-time applications.
Each chapter introduces a major area of embedded Linux. It describes the background so that you can learn the general principles, but it also includes detailed worked examples that illustrate each of these areas. You can treat this as a book of theory, or a book of examples. It works best if you do both: understand the theory and try it out in real life.
What this book covers
Chapter 1, Starting Out, sets the scene by describing the embedded Linux ecosystem and the choices available to you as you start your project.
Chapter 2, Learning About Toolchains, describes the components of a toolchain and shows you how to create a toolchain for cross-compiling code for the target board. It describes where to get a toolchain and provides details on how to build one from the source code.
Chapter 3, All About Bootloaders, explains the role of the bootloader in loading the Linux kernel into memory, and uses U-Boot and Barebox as examples. It also introduces device trees as the mechanism used to encode the details of hardware in almost all embedded Linux systems.
Chapter 4, Configuring and Building the Kernel, provides information on how to select a Linux kernel for an embedded system and configure it for the hardware within the device. It also covers how to port Linux to new hardware.
Chapter 5, Building a Root Filesystem, introduces the ideas behind the user space part of an embedded Linux implementation by means of a step-by-step guide on how to configure a root filesystem.
Chapter 6, Selecting a Build System, covers two commonly used embedded Linux build systems, Buildroot and Yocto Project, which automate the steps described in the previous four chapters.
Chapter 7, Creating a Storage Strategy, discusses the challenges created by managing flash memory, including raw flash chips and embedded MMC (eMMC) packages. It describes the filesystems that are applicable to each type of technology.
Chapter 8, Updating Software in the Field, examines various ways of updating the software after the device has been deployed, and includes fully managed Over the Air (OTA) updates. The key topics under discussion are reliability and security.
Chapter 9, Interfacing with Device Drivers, describes how kernel device drivers interact with the hardware, with worked examples of a simple driver. It also describes the various ways of calling device drivers from user space.
Chapter 10, Starting Up – The Init Program, shows how the first user space program, init, starts the rest of the system. It describes the three versions of the init program, each suitable for a different group of embedded systems, ranging from the simplicity of the BusyBox init, through System V init, to the current state-of-the-art, systemd.
Chapter 11, Managing Power, considers the various ways that Linux can be tuned to reduce power consumption, including Dynamic Frequency and Voltage scaling, selecting deeper idle states, and system suspend. The aim is to make devices that run for longer on a battery charge and also run cooler.
Chapter 12, Learning About Processes and Threads, describes embedded systems from the point of view of the application programmer. This chapter looks at processes and threads, inter-process communications, and scheduling policies.
Chapter 13, Managing Memory, introduces the ideas behind virtual memory and how the address space is divided into memory mappings. It also describes how to measure memory usage accurately and how to detect memory leaks.
Chapter 14, Debugging with GDB, shows you how to use the GNU debugger, GDB, together with the debug agent, gdbserver, to debug applications running remotely on the target device. It goes on to show how you can extend this model to debug kernel code, making use of the kernel debug stubs, KGDB.
Chapter 15, Profiling and Tracing, covers the techniques available to measure the system performance, starting from whole system profiles and then zeroing in on particular areas where bottlenecks are causing poor performance. It also describes how to use Valgrind to check the correctness of an application's use of thread synchronization and memory allocation.
Chapter 16, Real-Time Programming, provides a detailed guide to real-time programming on Linux, including the configuration of the kernel and the PREEMPT_RT real-time kernel patch. The kernel trace tool, Ftrace, is used to measure kernel latencies and show the effect of the various kernel configurations.
What you need for this book
The software used in this book is entirely open source. In almost all cases, I have used the latest stable versions available at the time of writing. While I have tried to describe the main features in a manner that is not version-specific, it is inevitable that some of the examples will need adaptation to work with later software.
Embedded development involves two systems: the host, which is used for developing the programs, and the target, which runs them. For the host system, I have used Ubuntu 16.04, but most Linux distributions will work with just a little modification. You may decide to run Linux as a guest in a virtual machine, but you should be aware that some tasks, such as building a distribution using the Yocto Project, are quite demanding and are better run on a native installation of Linux.
I chose two exemplar targets: the QEMU emulator and the BeagleBone Black. Using QEMU means that you can try out most of the examples without having to invest in any additional hardware. On the other hand, some things work better if you do have real hardware, for which I have chosen the BeagleBone Black because it is not expensive, it is widely available, and it has very good community support. Of course, you are not limited to just these two targets. The idea behind the book is to provide you with general solutions to problems so that you can apply them to a wide range of target boards.
Who this book is for
This book is written for developers who have an interest in embedded computing and Linux, and want to extend their knowledge into the various branches of the subject. In writing the book, I assume a basic understanding of the Linux command line, and in the programming examples, a working knowledge of the C language. Several chapters focus on the hardware that goes into an embedded target board, so a familiarity with hardware and hardware interfaces will be a definite advantage in these cases.
Conventions
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "You configure tap0 in exactly the same way as any other interface."
A block of code is set as follows:
/ {
    #address-cells = <2>;
    #size-cells = <2>;
    memory@80000000 {
        device_type = "memory";
        reg = <0x00000000 0x80000000 0 0x80000000>;
    };
};
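As a side note on the sample above, the four cells of the reg property only make sense once they are grouped according to #address-cells and #size-cells. The following is a minimal sketch of that grouping, not part of the book's example code: the function name decode_reg is hypothetical, and it assumes each cell is a 32-bit value with the most significant cell first.

```python
def decode_reg(cells, address_cells, size_cells):
    """Group 32-bit device tree cells into (address, size) pairs.

    Hypothetical helper for illustration only. Assumes every cell is a
    32-bit value and that multi-cell numbers are stored most significant
    cell first.
    """
    def join(chunk):
        # Concatenate the cells of one number into a single integer.
        value = 0
        for cell in chunk:
            value = (value << 32) | (cell & 0xFFFFFFFF)
        return value

    stride = address_cells + size_cells
    if stride == 0 or len(cells) % stride != 0:
        raise ValueError("cell count does not match #address-cells/#size-cells")

    regions = []
    for i in range(0, len(cells), stride):
        address = join(cells[i:i + address_cells])
        size = join(cells[i + address_cells:i + stride])
        regions.append((address, size))
    return regions


if __name__ == "__main__":
    # The reg value from the memory@80000000 node, with two cells each
    # for the address and the size.
    for address, size in decode_reg([0x00000000, 0x80000000, 0, 0x80000000], 2, 2):
        print(f"address=0x{address:x} size=0x{size:x} ({size // (1024 * 1024)} MiB)")
```

Under those assumptions, the sample node describes a single region that starts at 0x80000000 and is 2048 MiB long, which matches the unit address in the node name.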
Any command-line input or output is written as follows:
$ mipsel-unknown-linux-gnu-gcc -dumpmachine
mipsel-unknown-linux-gnu
New terms and important words are shown in bold.
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, e-mail [email protected], and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
Downloading the example code
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. You can download the code files by following these steps:
1. Log in or register to our website using your e-mail address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this book from.
7. Click on Code Download.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
TAR for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-Embedded-Linux-Programming-Second-Edition. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Downloading the color images of this book
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/MasteringEmbeddedLinuxProgrammingSecondEdition_ColorImages.pdf.
Errata
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title. To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy. Please contact us at copyright@packtpub.com with a link to the suspected pirated material. We appreciate your help in protecting our authors and our ability to bring you valuable content.
Questions
If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.
Starting Out
You are about to begin working on your next project, and this time it is going to be running Linux. What should you think about before you put finger to keyboard? Let's begin with a high-level look at embedded Linux and see why it is popular, what the implications of open source licenses are, and what kind of hardware you will need to run Linux.
Linux first became a viable choice for embedded devices around 1999. That was when Axis (https://www.axis.com) released their first Linux-powered network camera and TiVo (https://business.tivo.com/) their first Digital Video Recorder (DVR). Since 1999, Linux has become ever more popular, to the point that today it is the operating system of choice for many classes of product. At the time of writing, in 2017, there are about two billion devices running Linux. That includes a large number of smartphones running Android, which uses a Linux kernel, and hundreds of millions of set-top-boxes, smart TVs, and Wi-Fi routers, not to mention a very diverse range of devices such as vehicle diagnostics, weighing scales, industrial devices, and medical monitoring units that ship in smaller volumes.
So, why does your TV run Linux? At first glance, the function of a TV is simple: it has to display a stream of video on a screen. Why is a complex Unix-like operating system like Linux necessary?
The simple answer is Moore's Law: Gordon Moore, co-founder of Intel, observed in 1965 that the density of components on a chip was doubling about every year, an estimate he revised in 1975 to approximately every two years. That applies to the devices that we design and use in our everyday lives just as much as it does to desktops, laptops, and servers. At the heart of most embedded devices is a highly integrated chip that contains one or more processor cores and interfaces with main memory, mass storage, and peripherals of many types. This is referred to as a System on Chip, or SoC, and SoCs are increasing in complexity in accordance with Moore's Law. A typical SoC has a technical reference manual that stretches to thousands of pages.
Your TV is not simply displaying a video stream as the old analog sets used to do. The stream is digital, possibly encrypted, and it needs processing to create an image. Your TV is (or soon will be) connected to the Internet. It can receive content from smartphones, tablets, and home media servers. It can be (or soon will be) used to play games. And so on and so on. You need a full operating system to manage this degree of complexity.
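To put the Moore's Law figure into numbers, here is a back-of-the-envelope sketch. It assumes, purely for illustration, a constant two-year doubling period across the whole span from 1965 to 2017; the function name growth_factor is hypothetical, and real process technology has not followed such a smooth curve.

```python
def growth_factor(start_year, end_year, doubling_period_years=2.0):
    """Return the growth multiple implied by a fixed doubling period.

    Illustrative only: assumes density doubles at a perfectly constant
    rate, which is a simplification of Moore's observation.
    """
    if doubling_period_years <= 0:
        raise ValueError("doubling period must be positive")
    return 2.0 ** ((end_year - start_year) / doubling_period_years)


if __name__ == "__main__":
    # 52 years at one doubling every two years is 26 doublings.
    factor = growth_factor(1965, 2017)
    print(f"Growth from 1965 to 2017: about {factor:,.0f} times")
```

Twenty-six doublings give a factor of 2^26, or roughly 67 million, which gives a feel for why a modern SoC needs a reference manual of thousands of pages.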
Here are some points that drive the adoption of Linux:
Linux has the necessary functionality. It has a good scheduler, a good network stack, support for USB, Wi-Fi, Bluetooth, many kinds of storage media, good support for multimedia devices, and so on. It ticks all the boxes.
Linux has been ported to a wide range of processor architectures, including some that are very commonly found in SoC designs: ARM, MIPS, x86, and PowerPC.
Linux is open source, so you have the freedom to get the source code and modify it to meet your needs. You, or someone working on your behalf, can create a board support package for your particular SoC board or device. You can add protocols, features, and technologies that may be missing from the mainline source code. You can remove features that you don't need to reduce memory and storage requirements. Linux is flexible.
Linux has an active community; in the case of the Linux kernel, very active. There is a new release of the kernel every 8 to 10 weeks, and each release contains code from more than 1,000 developers. An active community means that Linux is up to date and supports current hardware, protocols, and standards.
Open source licenses guarantee that you have access to the source code. There is no vendor tie-in.
For these reasons, Linux is an ideal choice for complex devices. But there are a few caveats I should mention here. Complexity makes Linux harder to understand. Coupled with the fast-moving development process and the decentralized structure of open source, this means you have to put some effort into learning how to use it and to keep on re-learning as it changes. I hope that this book will help in the process.
Selecting the right operating system
Is Linux suitable for your project? Linux works well where the problem being solved justifies the complexity. It is especially good where connectivity, robustness, and complex user interfaces are required. However, it cannot solve every problem, so here are some things to consider before you jump in:
Is your hardware up to the job? Compared to a traditional real-time operating system (RTOS) such as VxWorks, Linux requires a lot more resources. It needs at least a 32-bit processor and lots more memory. I will go into more detail in the section on typical hardware requirements.
Do you have the right skill set? The early parts of a project, board bring-up, require detailed knowledge of Linux and how it relates to your hardware. Likewise, when debugging and tuning your application, you will need to be able to interpret the results. If you don't have the skills in-house, you may want to outsource some of the work. Of course, reading this book helps!
Is your system real-time? Linux can handle many real-time activities so long as you pay attention to certain details, which I will cover in Chapter 16, Real-Time Programming.
Consider these points carefully. Probably the best indicator of success is to look around for similar products that run Linux and see how they have done it; follow best practice.
The players
Where does open source software come from? Who writes it? In particular, how does this relate to the key components of embedded development: the toolchain, bootloader, kernel, and basic utilities found in the root filesystem?
The main players are:
The open source community: This, after all, is the engine that generates the software you are going to be using. The community is a loose alliance of developers, many of whom are funded in some way, perhaps by a not-for-profit organization, an academic institution, or a commercial company. They work together to further the aims of the various projects. There are many of them, some small, some large. Some that we will be making use of in the remainder of this book are Linux itself, U-Boot, BusyBox, Buildroot, the Yocto Project, and the many projects under the GNU umbrella.
CPU architects: These are the organizations that design the CPUs we use. The important ones here are ARM/Linaro (ARM-based SoCs), Intel (x86 and x86_64), Imagination Technologies (MIPS), and IBM (PowerPC). They implement or, at the very least, influence support for the basic CPU architecture.
SoC vendors (Atmel, Broadcom, Intel, Qualcomm, TI, and many others): They take the kernel and toolchain from the CPU architects and modify them to support their chips. They also create reference boards: designs that are used by the next level down to create development boards and working products.
Board vendors and OEMs: These people take the reference designs from SoC vendors and build them into specific products, for instance, set-top-boxes or cameras, or create more general purpose development boards, such as those from Advantech and Kontron. An important category is the cheap development boards such as BeagleBoard/BeagleBone and Raspberry Pi that have created their own ecosystems of software and hardware add-ons.
These form a chain, with your project usually at the end, which means that you do not have a free choice of components. You cannot simply take the latest kernel from https://www.kernel.org/, except in a few rare cases, because it does not have support for the chip or board that you are using.
This is an ongoing problem with embedded development. Ideally, the developers at each link in the chain would push their changes upstream, but they don't. It is not uncommon to find a kernel which has many thousands of patches that are not merged. In addition, SoC vendors tend to actively develop open source components only for their latest chips, meaning that support for any chip more than a couple of years old will be frozen and not receive any updates.
The consequence is that most embedded designs are based on old versions of software. They do not receive security fixes, performance enhancements, or features that are in newer versions. Problems such as Heartbleed (a bug in the OpenSSL libraries) and ShellShock (a bug in the bash shell) go unfixed. I will talk more about this later in this chapter under the topic of security.
What can you do about it? First, ask questions of your vendors: what is their update policy, how often do they revise kernel versions, what is the current kernel version, what was the one before that, and what is their policy for merging changes upstream? Some vendors are making great strides in this direction. You should prefer their chips.
Secondly, you can take steps to make yourself more self-sufficient. The chapters in section 1 explain the dependencies in more detail and show you where you can help yourself. Don't just take the package offered to you by the SoC or board vendor and use it blindly without considering the alternatives.
Project lifecycle
This book is divided into four sections that reflect the phases of a project. The phases are not necessarily sequential. Usually they overlap and you will need to jump back to revisit things that were done previously. However, they are representative of a developer's preoccupations as the project progresses:
Elements of embedded Linux (Chapters 1 to 6) will help you set up the development environment and create a working platform for the later phases. It is often referred to as the board bring-up phase.
System architecture and design choices (Chapters 7 to 11) will help you to look at some of the design decisions you will have to make concerning the storage of programs and data, how to divide work between kernel device drivers and applications, and how to initialize the system.
Writing embedded applications (Chapters 12 and 13) shows how to make effective use of the Linux process and threads model, and how to manage memory in a resource-constrained device.
Debugging and optimizing performance (Chapters 14 and 15) describes how to trace, profile, and debug your code in both the applications and the kernel.
The fifth section on real-time (Chapter 16, Real-Time Programming) stands somewhat alone because it is a small, but important, category of embedded systems. Designing for real-time behavior has an impact on each of the four main phases.
The four elements of embedded Linux
Every project begins by obtaining, customizing, and deploying these four elements: the toolchain, the bootloader, the kernel, and the root filesystem. This is the topic of the first section of this book.
Toolchain: The compiler and other tools needed to create code for your target device. Everything else depends on the toolchain.
Bootloader: The program that initializes the board and loads the Linux kernel.
Kernel: This is the heart of the system, managing system resources and interfacing with hardware.
Root filesystem: Contains the libraries and programs that are run once the kernel has completed its initialization.
Of course, there is also a fifth element, not mentioned here. That is the collection of programs specific to your embedded application which make the device do whatever it is supposed to do, be it weigh groceries, display movies, control a robot, or fly a drone.
Typically, you will be offered some or all of these elements as a package when you buy your SoC or board. But, for the reasons discussed earlier in the section on the players, they may not be the best choices for you. I will give you the background to make the right selections in the first six chapters and I will introduce you to two tools that automate the whole process for you: Buildroot and the Yocto Project.
Open source
The components of embedded Linux are open source, so now is a good time to consider what that means, why open source licenses work the way they do, and how this affects the often proprietary embedded device you will be creating from it.
Licenses
When talking about open source, the word free is often used. People new to the subject often take it to mean nothing to pay, and open source software licenses do indeed guarantee that you can use the software to develop and deploy systems for no charge. However, the more important meaning here is freedom, since you are free to obtain the source code, modify it in any way you see fit, and redeploy it in other systems. These licenses give you this right. Compare that with shareware licenses which allow you to copy the binaries for no cost but do not give you the source code, or other licenses that allow you to use the software for free under certain circumstances, for example, for personal use but not commercial. These are not open source.
I will provide the following comments in the interest of helping you understand the implications of working with open source licenses, but I would like to point out that I am an engineer and not a lawyer. What follows is my understanding of the licenses and the way they are interpreted.
Open source licenses fall broadly into two categories: the copyleft licenses such as the General Public License (GPL) and the permissive licenses such as those from the Berkeley Software Distribution (BSD), the Apache Foundation, and others.
The permissive licenses say, in essence, that you may modify the source code and use it in systems of your own choosing so long as you do not modify the terms of the license in any way. In other words, with that one restriction, you can do with it what you want, including building it into possibly proprietary systems.
The GPL licenses are similar, but have clauses which compel you to pass the rights to obtain and modify the software on to your end users. In other words, you share your source code. One option is to make it completely public by putting it onto a public server. Another is to offer it only to your end users by means of a written offer to provide the code when requested. The GPL goes further to say that you cannot incorporate GPL code into proprietary programs. Any attempt to do so would make the GPL apply to the whole. In other words, you cannot combine GPL and proprietary code in one program.
So, what about libraries? If they are licensed with the GPL, any program linked with them becomes GPL also. However, most libraries are licensed under the Lesser General Public License (LGPL). If this is the case, you are allowed to link with them from a proprietary program.
All the preceding description relates specifically to GPLv2 and LGPLv2.1. I should mention the latest versions, GPLv3 and LGPLv3. These are controversial, and I will admit that I don't fully understand the implications. However, the intention is to ensure that the GPLv3 and LGPLv3 components in any system can be replaced by the end user, which is in the spirit of open source software for everyone. It does pose some problems though. Some Linux devices are used to gain access to information according to a subscription level or another restriction, and replacing critical parts of the software may compromise that. Set-top boxes fit into this category. There are also issues with security. If the owner of a device has access to the system code, then so might an unwelcome intruder. Often the defense is to have kernel images that are signed by an authority, the vendor, so that unauthorized updates are not possible. Is that an infringement of my right to modify my device? Opinions differ.
The TiVo set-top box is an important part of this debate. It uses a Linux kernel, which is licensed under GPLv2. TiVo has released the source code of its version of the kernel and so complies with the license. TiVo also has a bootloader that will only load a kernel binary that is signed by them. Consequently, you can build a modified kernel for a TiVo box but you cannot load it on the hardware. The Free Software Foundation (FSF) takes the position that this is not in the spirit of open source software and refers to this procedure as Tivoization. The GPLv3 and LGPLv3 were written to explicitly prevent this happening. Some projects, the Linux kernel in particular, have been reluctant to adopt the version three licenses because of the restrictions they would place on device manufacturers.
Hardware for embedded Linux
If you are designing or selecting hardware for an embedded Linux project, what do you look out for?
Firstly, a CPU architecture that is supported by the kernel, unless you plan to add a new architecture yourself, of course! Looking at the source code for Linux 4.9, there are 31 architectures, each represented by a sub-directory in the arch/ directory. They are all 32- or 64-bit architectures, most with a memory management unit (MMU), but some without. The ones most often found in embedded devices are ARM, MIPS, PowerPC, and x86, each in 32- and 64-bit variants, and all of which have memory management units.
Most of this book is written with this class of processor in mind. There is another group that doesn't have an MMU and runs a subset of Linux known as microcontroller Linux or uClinux. These processor architectures include ARC, Blackfin, MicroBlaze, and Nios. I will mention uClinux from time to time but I will not go into detail because it is a rather specialized topic.
Secondly, you will need a reasonable amount of RAM. 16 MiB is a good minimum, although it is quite possible to run Linux using half that. It is even possible to run Linux with 4 MiB if you are prepared to go to the trouble of optimizing every part of the system. It may even be possible to get lower, but there comes a point at which it is no longer Linux.
Thirdly, there is non-volatile storage, usually flash memory. 8 MiB is enough for a simple device such as a webcam or a simple router. As with RAM, you can create a workable Linux system with less storage if you really want to, but the lower you go, the harder it becomes. Linux has extensive support for flash storage devices, including raw NOR and NAND flash chips, and managed flash in the form of SD cards, eMMC chips, USB flash memory, and so on.
Fourthly, a debug port is very useful, most commonly an RS-232 serial port. It does not have to be fitted on production boards, but makes board bring-up, debugging, and development much easier.
Fifthly, you need some means of loading software when starting from scratch. A few years ago, boards would have been fitted with a Joint Test Action Group (JTAG) interface for this purpose, but modern SoCs have the ability to load boot code directly from removable media, especially SD and micro SD cards, or serial interfaces such as RS-232 or USB.
In addition to these basics, there are interfaces to the specific bits of hardware your device needs to get its job done. Mainline Linux comes with open source drivers for many thousands of different devices, and there are drivers (of variable quality) from the SoC manufacturer and from the OEMs of third-party chips that may be included in the design, but remember my comments on the commitment and ability of some manufacturers. As a developer of embedded devices, you will find that you spend quite a lot of time evaluating and adapting third-party code, if you have it, or liaising with the manufacturer if you don't. Finally, you will have to write the device support for interfaces that are unique to the device, or find someone to do it for you.
Hardware used in this book
The worked examples in this book are intended to be generic, but to make them relevant and easy to follow, I have had to choose specific hardware. I have chosen two exemplar devices: the BeagleBone Black and QEMU. The first is a widely-available and cheap development board which can be used in serious embedded hardware. The second is a machine emulator that can be used to create a range of systems that are typical of embedded hardware. It was tempting to use QEMU exclusively, but, like all emulations, it is not quite the same as the real thing. Using a BeagleBone Black, you have the satisfaction of interacting with real hardware and seeing real LEDs flash. I could have selected a board that is more up-to-date than the BeagleBone Black, which is several years old now, but I believe that its popularity gives it a degree of longevity and it means that it will continue to be available for some years yet.
In any case, I encourage you to try out as many of the examples as you can, using either of these two platforms, or indeed any embedded hardware you may have to hand.
The BeagleBone Black
The BeagleBone and the later BeagleBone Black are open hardware designs for a small, credit card sized development board produced by CircuitCo LLC. The main repository of information is at https://beagleboard.org/. The main points of the specifications are:
TI AM335x 1 GHz ARM® Cortex-A8 Sitara SoC
512 MiB DDR3 RAM
2 or 4 GiB 8-bit eMMC on-board flash storage
Serial port for debug and development
MicroSD connector, which can be used as the boot device
Mini USB OTG client/host port that can also be used to power the board
Full size USB 2.0 host port
10/100 Ethernet port
HDMI for video and audio output
In addition, there are two 46-pin expansion headers for which there is a great variety of daughter boards, known as capes, which allow you to adapt the board to do many different things. However, you do not need to fit any capes in the examples in this book.
In addition to the board itself, you will need:
A mini USB to full-size USB cable (supplied with the board) to provide power, unless you have the last item on this list.
An RS-232 cable that can interface with the 6-pin 3.3V TTL level signals provided by the board. The Beagleboard website has links to compatible cables.
A microSD card and a means of writing to it from your development PC or laptop, which will be needed to load software onto the board.
An Ethernet cable, as some of the examples require network connectivity.
Optional, but recommended, a 5V power supply capable of delivering 1 A or more.
QEMU
QEMU is a machine emulator. It comes in a number of different flavors, each of which can emulate a processor architecture and a number of boards built using that architecture. For example, we have the following:
qemu-system-arm: ARM
qemu-system-mips: MIPS
qemu-system-ppc: PowerPC
qemu-system-i386 and qemu-system-x86_64: x86 and x86_64
For each architecture, QEMU emulates a range of hardware, which you can see by using the option -machine help. Each machine emulates most of the hardware that would normally be found on that board. There are options to link hardware to local resources, such as using a local file for the emulated disk drive. Here is a concrete example:
$ qemu-system-arm -machine vexpress-a9 -m 256M \
-drive file=rootfs.ext4,if=sd -kernel zImage \
-dtb vexpress-v2p-ca9.dtb \
-append "console=ttyAMA0,115200 root=/dev/mmcblk0" \
-serial stdio -net nic,model=lan9118 -net tap,ifname=tap0
The options used in the preceding command line are:
-machine vexpress-a9: Creates an emulation of an ARM Versatile Express development board with a Cortex-A9 processor
-m 256M: Populates it with 256 MiB of RAM
-drive file=rootfs.ext4,if=sd: Connects the SD interface to the local file rootfs.ext4 (which contains a filesystem image)
-kernel zImage: Loads the Linux kernel from the local file named zImage
-dtb vexpress-v2p-ca9.dtb: Loads the device tree from the local file vexpress-v2p-ca9.dtb
-append "...": Supplies this string as the kernel command-line
-serial stdio: Connects the serial port to the terminal that launched QEMU, usually so that you can log on to the emulated machine via the serial console
-net nic,model=lan9118: Creates a network interface
-net tap,ifname=tap0: Connects the network interface to the virtual network interface tap0
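To make the role of each option easier to see, here is a minimal sketch that assembles the same command line from shell variables and prints it rather than running it. The variable names and the function name are hypothetical, and the if=sd spelling of the SD interface is an assumption about the intended -drive syntax; the file names are the ones used in the text.

```shell
#!/bin/sh
# Sketch only: print the QEMU command line instead of executing it.
# KERNEL, DTB, ROOTFS and TAP are illustrative names, not part of QEMU.
KERNEL=zImage
DTB=vexpress-v2p-ca9.dtb
ROOTFS=rootfs.ext4
TAP=tap0

qemu_cmdline() {
    # One argument per option or value, joined with single spaces
    printf '%s ' \
        qemu-system-arm \
        -machine vexpress-a9 \
        -m 256M \
        -drive "file=$ROOTFS,if=sd" \
        -kernel "$KERNEL" \
        -dtb "$DTB" \
        -append '"console=ttyAMA0,115200 root=/dev/mmcblk0"' \
        -serial stdio \
        -net nic,model=lan9118 \
        -net "tap,ifname=$TAP"
    printf '\n'
}

qemu_cmdline
```

Keeping the file names in variables means that switching to a different kernel image or root filesystem is a one-line change.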
To configure the host side of the network, you need the tunctl command from the User Mode Linux (UML) project; on Debian and Ubuntu, the package is named uml-utilities:
$ sudo tunctl -u $(whoami) -t tap0
This creates a network interface named tap0 which is connected to the network controller in the emulated QEMU machine. You configure tap0 in exactly the same way as any other interface.
All of these options are described in detail in the following chapters. I will be using Versatile Express for most of my examples, but it should be easy to use a different machine or architecture.
Software used in this book
I have used only open source software, both for the development tools and the target operating system and applications. I assume that you will be using Linux on your development system. I tested all the host commands using Ubuntu 14.04 and so there is a slight bias towards that particular version, but any modern Linux distribution is likely to work just fine.
Summary
Embedded hardware will continue to get more complex, following the trajectory set by Moore's Law. Linux has the power and the flexibility to make use of hardware in an efficient way.
Linux is just one component of open source software out of the many that you need to create a working product. The fact that the code is freely available means that people and organizations at many different levels can contribute. However, the sheer variety of embedded platforms and the fast pace of development lead to isolated pools of software which are not shared as efficiently as they should be. In many cases, you will become dependent on this software, especially the Linux kernel that is provided by your SoC or board vendor, and to a lesser extent, the toolchain. Some SoC manufacturers are getting better at pushing their changes upstream and the maintenance of these changes is getting easier.
Fortunately, there are some powerful tools that can help you create and maintain the software for your device. For example, Buildroot is ideal for small systems and the Yocto Project for larger ones. Before I describe these build tools, I will describe the four elements of embedded Linux, which you can apply to all embedded Linux projects, however they are created.
The next chapter is all about the first of these, the toolchain, which you need to compile code for your target platform.
Learning About Toolchains
The toolchain is the first element of embedded Linux and the starting point of your project. You will use it to compile all the code that will run on your device. The choices you make at this early stage will have a profound impact on the final outcome. Your toolchain should be capable of making effective use of your hardware by using the optimum instruction set for your processor. It should support the languages that you require, and have a solid implementation of the Portable Operating System Interface (POSIX) and other system interfaces. Not only that, but it should be updated when security flaws are discovered or bugs are found. Finally, it should be constant throughout the project. In other words, once you have chosen your toolchain, it is important to stick with it. Changing compilers and development libraries in an inconsistent way during a project will lead to subtle bugs.
Obtaining a toolchain can be as simple as downloading and installing a TAR file, or it can be as complex as building the whole thing from source code. In this chapter, I take the latter approach, with the help of a tool called crosstool-NG, so that I can show you the details of creating a toolchain. Later on in Chapter 6, Selecting a Build System, I will switch to using the toolchain generated by the build system, which is the more usual means of obtaining a toolchain.
In this chapter, we will cover the following topics:
Introducing toolchains
Finding a toolchain
Building a toolchain using the crosstool-NG tool
Anatomy of a toolchain
Linking with libraries: static and dynamic linking
The art of cross compiling
Introducing toolchains
A toolchain is the set of tools that compiles source code into executables that can run on your target device, and includes a compiler, a linker, and run-time libraries. Initially you need one to build the other three elements of an embedded Linux system: the bootloader, the kernel, and the root filesystem. It has to be able to compile code written in assembly, C, and C++ since these are the languages used in the base open source packages.
Usually, toolchains for Linux are based on components from the GNU project (http://www.gnu.org), and that is still true in the majority of cases at the time of writing. However, over the past few years, the Clang compiler and the associated Low Level Virtual Machine (LLVM) project (http://llvm.org) have progressed to the point that they are now a viable alternative to a GNU toolchain. One major distinction between LLVM and GNU-based toolchains is the licensing; LLVM has a BSD license while GNU has the GPL. There are some technical advantages to Clang as well, such as faster compilation and better diagnostics, but GNU GCC has the advantage of compatibility with the existing code base and support for a wide range of architectures and operating systems. Indeed, there are still some areas where Clang cannot replace the GNU C compiler, especially when it comes to compiling a mainline Linux kernel. It is probable that, in the next year or so, Clang will be able to compile all the components needed for embedded Linux and so will become an alternative to GNU. There is a good description of how to use Clang for cross compilation at http://clang.llvm.org/docs/CrossCompilation.html. If you would like to use it as part of an embedded Linux build system, the EmbToolkit (https://www.embtoolkit.org) fully supports both GNU and LLVM/Clang toolchains, and various people are working on using Clang with Buildroot and the Yocto Project. I will cover embedded build systems in Chapter 6, Selecting a Build System. Meanwhile, this chapter focuses on the GNU toolchain as it is the only complete option at this time.
A standard GNU toolchain consists of three main components:
Binutils: A set of binary utilities including the assembler and the linker. It is available at http://www.gnu.org/software/binutils.
GNU Compiler Collection (GCC): These are the compilers for C and other languages which, depending on the version of GCC, include C++, Objective-C, Objective-C++, Java, Fortran, Ada, and Go. They all use a common backend which produces assembler code, which is fed to the GNU assembler. It is available at http://gcc.gnu.org/.
C library: A standardized application program interface (API) based on the POSIX specification, which is the main interface to the operating system kernel for applications. There are several C libraries to consider, as we shall see later on in this chapter.
As well as these, you will need a copy of the Linux kernel headers, which contain definitions and constants that are needed when accessing the kernel directly. Right now, you need them to be able to compile the C library, but you will also need them later when writing programs or compiling libraries that interact with particular Linux devices, for example, to display graphics via the Linux frame buffer driver. This is not simply a question of making a copy of the header files in the include directory of your kernel source code. Those headers are intended for use in the kernel only and contain definitions that will cause conflicts if used in their raw state to compile regular Linux applications.
Instead, you will need to generate a set of sanitized kernel headers, which I have illustrated in Chapter 5, Building a Root Filesystem.
It is not usually crucial whether the kernel headers are generated from the exact version of Linux you are going to be using or not. Since the kernel interfaces are always backwards-compatible, it is only necessary that the headers are from a kernel that is the same as, or older than, the one you are using on the target.
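The rule about header versions can be expressed as a simple comparison. The following is a minimal sketch, not part of any toolchain: the function names headers_ok and report are hypothetical, and it assumes plain dotted version numbers such as 4.9 or 4.9.13.

```shell
#!/bin/sh
# headers_ok HEADERS_VERSION TARGET_KERNEL_VERSION
# Succeeds when the headers come from a kernel that is the same as, or
# older than, the kernel on the target. Suffixes such as -rc1 are not handled.
headers_ok() {
    hdr=$1
    tgt=$2
    old_ifs=$IFS
    IFS=.                       # split the versions on dots
    set -- $hdr; h1=${1:-0}; h2=${2:-0}; h3=${3:-0}
    set -- $tgt; t1=${1:-0}; t2=${2:-0}; t3=${3:-0}
    IFS=$old_ifs
    # Compare field by field, numerically, most significant first
    [ "$h1" -lt "$t1" ] && return 0
    [ "$h1" -gt "$t1" ] && return 1
    [ "$h2" -lt "$t2" ] && return 0
    [ "$h2" -gt "$t2" ] && return 1
    [ "$h3" -le "$t3" ]
}

report() {
    if headers_ok "$1" "$2"; then
        echo "headers $1 are usable with kernel $2"
    else
        echo "headers $1 are too new for kernel $2"
    fi
}

report 4.3 4.9
report 4.9 4.9
report 4.10 4.9
```

The comparison is numeric in each field, so 4.10 is correctly treated as newer than 4.9.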
Most people would consider the GNU Debugger (GDB) to be part of the toolchain as well, and it is usual that it is built at this point. I will talk about GDB in Chapter 14, Debugging with GDB.
Types of toolchains
For our purposes, there are two types of toolchain:
Native: This toolchain runs on the same type of system (sometimes the same actual system) as the programs it generates. This is the usual case for desktops and servers, and it is becoming popular on certain classes of embedded devices. The Raspberry Pi running Debian for ARM, for example, has self-hosted native compilers.
Cross: This toolchain runs on a different type of system than the target, allowing the development to be done on a fast desktop PC and then loaded onto the embedded target for testing.
Almost all embedded Linux development is done using a cross development toolchain, partly because most embedded devices are not well suited to program development since they lack computing power, memory, and storage, but also because it keeps the host and target environments separate. The latter point is especially important when the host and the target are using the same architecture, x86_64, for example. In this case, it is tempting to compile natively on the host and simply copy the binaries to the target.
This works up to a point, but it is likely that the host distribution will receive updates more often than the target, or that different engineers building code for the target will have slightly different versions of the host development libraries. Over time, the development and target systems will diverge and you will violate the principle that the toolchain should remain constant throughout the life of the project. You can make this approach work if you ensure that the host and the target build environments are in lockstep with each other. However, a much better approach is to keep the host and the target separate, and a cross toolchain is the way to do that.
However, there is a counter argument in favor of native development. Cross development creates the burden of cross-compiling all the libraries and tools that you need for your target. We will see later in this chapter that cross-compiling is not always simple because many open source packages are not designed to be built in this way. Integrated build tools, including Buildroot and the Yocto Project, help by encapsulating the rules to cross compile a range of packages that you need in typical embedded systems, but if you want to compile a large number of additional packages, then it is better to natively compile them. For example, building a Debian distribution for the Raspberry Pi or BeagleBone using a cross compiler would be very hard. Instead, they are natively compiled. Creating a native build environment from scratch is not easy. You would still need a cross compiler at first to create the native build environment on the target, which you then use to build the packages. Then, in order to perform the native build in a reasonable amount of time, you would need a build farm of well-provisioned target boards, or you may be able to use QEMU to emulate the target.
Meanwhile, in this chapter, I will focus on the more mainstream cross compiler environment, which is relatively easy to set up and administer.
CPU architectures
The toolchain has to be built according to the capabilities of the target CPU, which includes:
CPU architecture: ARM, MIPS, x86_64, and so on
Big- or little-endian operation: Some CPUs can operate in both modes, but the machine code is different for each
Floating point support: Not all versions of embedded processors implement a hardware floating point unit, in which case the toolchain has to be configured to call a software floating point library instead
Application Binary Interface (ABI): The calling convention used for passing parameters between function calls
With many architectures, the ABI is constant across the family of processors. One notable exception is ARM. The ARM architecture transitioned to the Extended Application Binary Interface (EABI) in the late 2000s, resulting in the previous ABI being named the Old Application Binary Interface (OABI). While the OABI is now obsolete, you continue to see references to EABI. Since then, the EABI has split into two, based on the way the floating point parameters are passed. The original EABI uses general purpose (integer) registers, while the newer Extended Application Binary Interface Hard-Float (EABIHF) uses floating point registers. The EABIHF is significantly faster at floating point operations, since it removes the need for copying between integer and floating point registers, but it is not compatible with CPUs that do not have a floating point unit. The choice, then, is between two incompatible ABIs; you cannot mix and match the two, and so you have to decide at this stage.
GNU uses a prefix to the name of each tool in the toolchain, which identifies the various combinations that can be generated. It consists of a tuple of three or four components separated by dashes, as described here:
CPU: This is the CPU architecture, such as ARM, MIPS, or x86_64. If the CPU has both endian modes, they may be differentiated by adding el for little-endian or eb for big-endian. Good examples are little-endian MIPS, mipsel, and big-endian ARM, armeb.
Vendor: This identifies the provider of the toolchain. Examples include buildroot, poky, or just unknown. Sometimes it is left out altogether.
Kernel: For our purposes, it is always linux.
Operating system: A name for the user space component, which might be gnu or musl. The ABI may be appended here as well, so for ARM toolchains, you may see gnueabi, gnueabihf, musleabi, or musleabihf.
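As an illustration of how the tuple breaks down, here is a minimal sketch that splits a tuple on dashes and reports its parts. The function name describe_tuple is hypothetical, and the logic is a simplification: it assumes the three- and four-component forms described above and will misread tuples that follow other conventions.

```shell
#!/bin/sh
# describe_tuple TUPLE
# Splits a toolchain tuple into CPU, vendor, kernel and operating system,
# and notes the ARM ABI when it is appended to the last component.
describe_tuple() {
    old_ifs=$IFS
    IFS=-                       # components are separated by dashes
    set -- $1
    IFS=$old_ifs
    case $# in
        3) cpu=$1; vendor="(omitted)"; kernel=$2; os=$3 ;;
        4) cpu=$1; vendor=$2; kernel=$3; os=$4 ;;
        *) echo "unrecognized tuple" >&2; return 1 ;;
    esac
    case $os in
        *eabihf) abi=EABIHF ;;  # float arguments in floating point registers
        *eabi)   abi=EABI ;;    # float arguments in integer registers
        *)       abi=default ;;
    esac
    echo "cpu=$cpu vendor=$vendor kernel=$kernel os=$os abi=$abi"
}

describe_tuple x86_64-linux-gnu
describe_tuple mipsel-unknown-linux-gnu
describe_tuple arm-cortex_a8-linux-gnueabihf
```

The order of the two eabi patterns matters: the longer suffix has to be tested first, otherwise gnueabihf would never be reached.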
You can find the tuple used when building the toolchain by using the -dumpmachine option of gcc. For example, you may see the following on the host computer:
$ gcc -dumpmachine
x86_64-linux-gnu
When a native compiler is installed on a machine, it is normal to create links to each of the tools in the toolchain with no prefixes, so that you can call the C compiler with the gcc command.
Here is an example using a cross compiler:
$ mipsel-unknown-linux-gnu-gcc -dumpmachine
mipsel-unknown-linux-gnu
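Because every tool in a cross toolchain carries the same prefix, scripts often keep the prefix in one variable and derive the tool names from it. The sketch below follows the CROSS_COMPILE naming convention used by the Linux kernel build; the function name tool_name and the list of tools checked are illustrative choices.

```shell
#!/bin/sh
# The prefix is the tuple followed by a dash; this one is the example
# tuple from the text and may not exist on your machine.
CROSS_COMPILE=mipsel-unknown-linux-gnu-

# tool_name TOOL: print the prefixed name of a toolchain program
tool_name() {
    echo "${CROSS_COMPILE}$1"
}

# Report which of the usual tools can be found on the PATH
for tool in gcc g++ ld as objdump; do
    name=$(tool_name "$tool")
    if command -v "$name" >/dev/null 2>&1; then
        echo "$name: found"
    else
        echo "$name: not on PATH"
    fi
done
```

Setting CROSS_COMPILE to an empty string makes the same script refer to the native tools.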
Choosing the C library
The programming interface to the Unix operating system is defined in the C language, which is now defined by the POSIX standards. The C library is the implementation of that interface; it is the gateway to the kernel for Linux programs, as shown in the following diagram. Even if you are writing programs in another language, maybe Java or Python, the respective run-time support libraries will have to call the C library eventually, as shown here:
Whenever the C library needs the services of the kernel, it will use the kernel system call interface to transition between user space and kernel space. It is possible to bypass the C library by making the kernel system calls directly, but that is a lot of trouble and almost never necessary.
There are several C libraries to choose from. The main options are as follows:
glibc: This is the standard GNU C library, available at http://www.gnu.org/software/libc. It is big and, until recently, not very configurable, but it is the most complete implementation of the POSIX API. The license is LGPL 2.1.
musl libc: This is available at https://www.musl-libc.org. The musl libc library is comparatively new, but has been gaining a lot of attention as a small and standards-compliant alternative to GNU libc. It is a good choice for systems with a limited amount of RAM and storage. It has an MIT license.
uClibc-ng: This is available at https://uclibc-ng.org/. The u is really a Greek mu character, indicating that this is the microcontroller C library. It was first developed to work with uClinux (Linux for CPUs without memory management units), but has since been adapted to be used with full Linux. The uClibc-ng library is a fork of the original uClibc project (https://uclibc.org/), which has unfortunately fallen into disrepair. Both are licensed with LGPL 2.1.
eglibc: This is available at http://www.eglibc.org/home. Now obsolete, eglibc was a fork of glibc with changes to make it more suitable for embedded usage. Among other things, eglibc added configuration options and support for architectures not covered by glibc, in particular the PowerPC e500 CPU core. The code base from eglibc was merged back into glibc in version 2.20. The eglibc library is no longer maintained.
So, which to choose? My advice is to use uClibc-ng only if you are using uClinux. If you have a very limited amount of storage or RAM, then musl libc is a good choice, otherwise, use glibc, as shown in this flowchart:
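The same decision can be written down as a few lines of shell. This is only a sketch of the advice given above: the function name choose_libc and its yes/no arguments are hypothetical, and what counts as very limited storage or RAM is left for you to judge.

```shell
#!/bin/sh
# choose_libc USES_UCLINUX VERY_LIMITED_RESOURCES
# Both arguments are "yes" or "no". Prints the suggested C library.
choose_libc() {
    uses_uclinux=$1
    very_limited=$2
    if [ "$uses_uclinux" = yes ]; then
        echo "uClibc-ng"        # only recommended for uClinux targets
    elif [ "$very_limited" = yes ]; then
        echo "musl libc"        # small footprint for tight RAM or storage
    else
        echo "glibc"            # the default choice otherwise
    fi
}

choose_libc no no
choose_libc no yes
choose_libc yes no
```

The uClinux question is asked first, so a uClinux target gets uClibc-ng regardless of how much memory it has.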
Finding a toolchain
You have three choices for your cross development toolchain: you may find a ready built toolchain that matches your needs, you can use the one generated by an embedded build tool which is covered in Chapter 6, Selecting a Build System, or you can create one yourself as described later in this chapter.
A pre-built cross toolchain is an attractive option in that you only have to download and install it, but you are limited to the configuration of that particular toolchain and you are dependent on the person or organization you got it from. Most likely, it will be one of these:
An SoC or board vendor. Most vendors offer a Linux toolchain.
A consortium dedicated to providing system-level support for a given architecture. For example, Linaro (https://www.linaro.org/) has pre-built toolchains for the ARM architecture.
A third-party Linux tool vendor, such as Mentor Graphics, TimeSys, or MontaVista.
The cross tool packages for your desktop Linux distribution. For example, Debian-based distributions have packages for cross compiling for ARM, MIPS, and PowerPC targets.
A binary SDK produced by one of the integrated embedded build tools. The Yocto Project has some examples at http://downloads.yoctoproject.org/releases/yocto/yocto-[version]/toolchain.
A link from a forum that you can't find any more.
In all of these cases, you have to decide whether the pre-built toolchain on offer meets your requirements. Does it use the C library you prefer? Will the provider give you updates for security fixes and bugs, bearing in mind my comments on support and updates from Chapter 1, Starting Out? If your answer is no to any of these, then you should consider creating your own.
Unfortunately, building a toolchain is no easy task. If you truly want to do the whole thing yourself, take a look at Cross Linux From Scratch (http://trac.clfs.org). There you will find step-by-step instructions on how to create each component.
A simpler alternative is to use crosstool-NG, which encapsulates the process into a set of scripts and has a menu-driven frontend. You still need a fair degree of knowledge, though, just to make the right choices.
It is simpler still to use a build system such as Buildroot or the Yocto Project, since they generate a toolchain as part of the build process. This is my preferred solution, as I have shown in Chapter 6, Selecting a Build System.
Building a toolchain using crosstool-NG
Some years ago, Dan Kegel wrote a set of scripts and makefiles for generating cross development toolchains and called it crosstool (http://kegel.com/crosstool/). In 2007, Yann E. Morin used that base to create the next generation of crosstool, crosstool-NG (http://crosstool-ng.github.io/). Today it is by far the most convenient way to create a stand-alone cross toolchain from source.
Installing crosstool-NG
Before you begin, you will need a working native toolchain and build tools on your host PC. To work with crosstool-NG on an Ubuntu host, you will need to install the packages using the following command:
$ sudo apt-get install automake bison chrpath flex g++ git gperf \
gawk libexpat1-dev libncurses5-dev libsdl1.2-dev libtool \
python2.7-dev texinfo
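Before moving on, it can be useful to confirm that the build tools really are on your PATH. The following is a small sketch for that purpose: the function name missing_tools is hypothetical, and the list covers only packages that install a program of a known name (makeinfo comes from texinfo); the -dev library packages are not checked.

```shell
#!/bin/sh
# missing_tools NAME...
# Prints, one per line, each named program that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools automake bison chrpath flex g++ git gperf gawk makeinfo)
if [ -z "$missing" ]; then
    echo "all listed build tools were found"
else
    echo "not found on PATH:"
    echo "$missing"
fi
```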
Next, get the current release from the crosstool-NG Git repository. In my examples, I have used version 1.22.0. Clone the repository, check out that release, and create the frontend menu system, ct-ng, as shown in the following commands:
$ git clone https://github.com/crosstool-ng/crosstool-ng.git
$ cd crosstool-ng
$ git checkout crosstool-ng-1.22.0
$ ./bootstrap
$ ./configure --enable-local
$ make
$ make install
The --enable-local option means that the program will be installed into the current directory, which avoids the need for root permissions, as would be required if you were to install it in the default location /usr/local/bin. Type ./ct-ng from the current directory to launch the crosstool menu.
Building a toolchain for BeagleBone Black
Crosstool-NG can build many different combinations of toolchains. To make the initial configuration easier, it comes with a set of samples that cover many of the common use-cases. Use ./ct-ng list-samples to generate the list.
The BeagleBone Black has a TI AM335x SoC, which contains an ARM Cortex A8 core and a VFPv3 floating point unit. Since the BeagleBone Black has plenty of RAM and storage, we can use glibc as the C library. The closest sample is arm-cortex_a8-linux-gnueabi. You can see the default configuration by prefixing the name with show-, as demonstrated here:
$ ./ct-ng show-arm-cortex_a8-linux-gnueabi
[L..] arm-cortex_a8-linux-gnueabi
    OS             : linux-4.3
    Companion libs : gmp-6.0.0a mpfr-3.1.3 mpc-1.0.3 libelf-0.8.13 expat-2.1.0 ncurses-6.0
    binutils       : binutils-2.25.1
    C compilers    : gcc | 5.2.0
    Languages      : C,C++
    C library      : glibc-2.22 (threads: nptl)
    Tools          : dmalloc-5.5.2 duma-2_5_15 gdb-7.10 ltrace-0.7.3 strace-4.10
Thisisaclosematchwithourrequirements,exceptthatitusingtheeabibinaryinterface,whichpassesfloatingpointargumentsinintegerregisters.Wewouldprefertousehardwarefloatingpointregistersforthatpurposebecauseitwouldspeedupfunctioncallsthathavefloatanddoubleparametertypes.Youcanchangetheconfigurationlateron,sofornowyoushouldselectthistargetconfiguration:
$./ct-ngarm-cortex_a8-linux-gnueabi
Atthispoint,youcanreviewtheconfigurationandmakechangesusingtheconfigurationmenucommandmenuconfig:
$./ct-ngmenuconfig
The menu system is based on the Linux kernel menuconfig, and so navigation of the user interface will be familiar to anyone who has configured a kernel. If not, refer to Chapter 4, Configuring and Building the Kernel, for a description of menuconfig.
There are two configuration changes that I would recommend you make at this point:
In Paths and misc options, disable Render the toolchain read-only (CT_INSTALL_DIR_RO)
In Target options | Floating point, select hardware (FPU) (CT_ARCH_FLOAT_HW)
The first is necessary if you want to add libraries to the toolchain after it has been installed, which I describe later in this chapter. The second selects the eabihf binary interface for the reasons discussed earlier. The names in parentheses are the configuration labels stored in the configuration file. When you have made the changes, exit the menuconfig menu and save the configuration as you do so.
Now you can use crosstool-NG to get, configure, and build the components according to your specification, by typing the following command:
$ ./ct-ng build
The build will take about half an hour, after which you will find your toolchain in ~/x-tools/arm-cortex_a8-linux-gnueabihf.
Building a toolchain for QEMU
On the QEMU target, you will be emulating an ARM Versatile PB evaluation board that has an ARM926EJ-S processor core, which implements the ARMv5TE instruction set. You need to generate a crosstool-NG toolchain that matches that specification. The procedure is very similar to the one for the BeagleBone Black.
You begin by running ./ct-ng list-samples to find a good base configuration to work from. There isn't an exact fit, so use a generic target, arm-unknown-linux-gnueabi. You select it as shown, running distclean first to make sure that there are no artifacts left over from a previous build:
$ ./ct-ng distclean
$ ./ct-ng arm-unknown-linux-gnueabi
As with the BeagleBone Black, you can review the configuration and make changes using the configuration menu command ./ct-ng menuconfig. There is only one change necessary:
In Paths and misc options, disable Render the toolchain read-only (CT_INSTALL_DIR_RO)
Now, build the toolchain with the command as shown here:
$ ./ct-ng build
As before, the build will take about half an hour. The toolchain will be installed in ~/x-tools/arm-unknown-linux-gnueabi.
Anatomy of a toolchain
To get an idea of what is in a typical toolchain, I want to examine the crosstool-NG toolchain you have just created. The examples use the ARM Cortex A8 toolchain created for the BeagleBone Black, which has the prefix arm-cortex_a8-linux-gnueabihf-. If you built the ARM926EJ-S toolchain for the QEMU target, then the prefix will be arm-unknown-linux-gnueabi- instead.
The ARM Cortex A8 toolchain is in the directory ~/x-tools/arm-cortex_a8-linux-gnueabihf/bin. In there you will find the cross compiler, arm-cortex_a8-linux-gnueabihf-gcc. To make use of it, you need to add the directory to your path using the following command:
$ PATH=~/x-tools/arm-cortex_a8-linux-gnueabihf/bin:$PATH
Now you can take a simple helloworld program, which in the C language looks like this:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
    printf("Hello, world!\n");
    return 0;
}
You compile it like this:
$ arm-cortex_a8-linux-gnueabihf-gcc helloworld.c -o helloworld
You can confirm that it has been cross compiled by using the file command to print the type of the file:
$ file helloworld
helloworld: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 4.3.0, not stripped
Finding out about your cross compiler
Imagine that you have just received a toolchain and that you would like to know more about how it was configured. You can find out a lot by querying gcc. For example, to find the version, you use --version:
$ arm-cortex_a8-linux-gnueabihf-gcc --version
arm-cortex_a8-linux-gnueabihf-gcc (crosstool-NG crosstool-ng-1.22.0) 5.2.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
To find how it was configured, use -v:
$ arm-cortex_a8-linux-gnueabihf-gcc -v
Using built-in specs.
COLLECT_GCC=arm-cortex_a8-linux-gnueabihf-gcc
COLLECT_LTO_WRAPPER=/home/chris/x-tools/arm-cortex_a8-linux-gnueabihf/libexec/gcc/arm-cortex_a8-linux-gnueabihf/5.2.0/lto-wrapper
Target: arm-cortex_a8-linux-gnueabihf
Configured with: /home/chris/crosstool-ng/.build/src/gcc-5.2.0/configure --build=x86_64-build_pc-linux-gnu --host=x86_64-build_pc-linux-gnu --target=arm-cortex_a8-linux-gnueabihf --prefix=/home/chris/x-tools/arm-cortex_a8-linux-gnueabihf --with-sysroot=/home/chris/x-tools/arm-cortex_a8-linux-gnueabihf/arm-cortex_a8-linux-gnueabihf/sysroot --enable-languages=c,c++ --with-cpu=cortex-a8 --with-float=hard --with-pkgversion='crosstool-NG crosstool-ng-1.22.0' --enable-__cxa_atexit --disable-libmudflap --disable-libgomp --disable-libssp --disable-libquadmath --disable-libquadmath-support --disable-libsanitizer --with-gmp=/media/chris/android/home/training/MELP/ch02/crosstool-ng/.build/arm-cortex_a8-linux-gnueabihf/buildtools --with-mpfr=/media/chris/android/home/training/MELP/ch02/crosstool-ng/.build/arm-cortex_a8-linux-gnueabihf/buildtools --with-mpc=/media/chris/android/home/training/MELP/ch02/crosstool-ng/.build/arm-cortex_a8-linux-gnueabihf/buildtools --with-isl=/media/chris/android/home/training/MELP/ch02/crosstool-ng/.build/arm-cortex_a8-linux-gnueabihf/buildtools --with-cloog=/media/chris/android/home/training/MELP/ch02/crosstool-ng/.build/arm-cortex_a8-linux-gnueabihf/buildtools --with-libelf=/media/chris/android/home/training/MELP/ch02/crosstool-ng/.build/arm-cortex_a8-linux-gnueabihf/buildtools --enable-lto --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --enable-threads=posix --enable-target-optspace --enable-plugin --enable-gold --disable-nls --disable-multilib --with-local-prefix=/home/chris/x-tools/arm-cortex_a8-linux-gnueabihf/arm-cortex_a8-linux-gnueabihf/sysroot --enable-long-long
Thread model: posix
gcc version 5.2.0 (crosstool-NG crosstool-ng-1.22.0)
There is a lot of output there, but the interesting things to note are:
--with-sysroot=/home/chris/x-tools/arm-cortex_a8-linux-gnueabihf/arm-cortex_a8-linux-gnueabihf/sysroot: This is the default sysroot directory; see the following section for an explanation
--enable-languages=c,c++: Using this, we have both C and C++ languages enabled
--with-cpu=cortex-a8: The code is generated for an ARM Cortex A8 core
--with-float=hard: Generates opcodes for the floating point unit and uses the VFP registers for parameters
--enable-threads=posix: This enables the POSIX threads
These are the default settings for the compiler. You can override most of them on the gcc command line. For example, if you want to compile for a different CPU, you can override the configured setting, --with-cpu, by adding -mcpu to the command line, as follows:
$ arm-cortex_a8-linux-gnueabihf-gcc -mcpu=cortex-a5 helloworld.c \
-o helloworld
You can print out the range of architecture-specific options available using --target-help, as follows:
$ arm-cortex_a8-linux-gnueabihf-gcc --target-help
You may be wondering if it matters that you get the configuration exactly right at this point, since you can always change it as shown here. The answer depends on the way you anticipate using it. If you plan to create a new toolchain for each target, then it makes sense to set everything up at the beginning, because it will reduce the risk of getting it wrong later on. Jumping ahead a little to Chapter 6, Selecting a Build System, I call this the Buildroot philosophy. If, on the other hand, you want to build a toolchain that is generic and you are prepared to provide the correct settings when you build for a particular target, then you should make the base toolchain generic, which is the way the Yocto Project handles things. The preceding examples follow the Buildroot philosophy.
The sysroot, library, and header files
The toolchain sysroot is a directory which contains subdirectories for libraries, header files, and other configuration files. It can be set when the toolchain is configured through --with-sysroot=, or it can be set on the command line using --sysroot=. You can see the location of the default sysroot by using -print-sysroot:
$ arm-cortex_a8-linux-gnueabihf-gcc -print-sysroot
/home/chris/x-tools/arm-cortex_a8-linux-gnueabihf/arm-cortex_a8-linux-gnueabihf/sysroot
You will find the following subdirectories in sysroot:
lib: Contains the shared objects for the C library and the dynamic linker/loader, ld-linux
usr/lib: Contains the static library archive files for the C library, and any other libraries that may be installed subsequently
usr/include: Contains the headers for all the libraries
usr/bin: Contains the utility programs that run on the target, such as the ldd command
usr/share: Used for localization and internationalization
sbin: Provides the ldconfig utility, used to optimize library loading paths
Plainly, some of these are needed on the development host to compile programs, and others (for example, the shared libraries and ld-linux) are needed on the target at runtime.
Other tools in the toolchain
The following table shows various other components of a GNU toolchain, together with a brief description:
Command     Description
addr2line   Converts program addresses into file names and line numbers by reading the debug symbol tables in an executable file. It is very useful when decoding addresses printed out in a system crash report.
ar          The archive utility is used to create static libraries.
as          This is the GNU assembler.
c++filt     This is used to demangle C++ and Java symbols.
cpp         This is the C preprocessor and is used to expand #define, #include, and other similar directives. You seldom need to use this by itself.
elfedit     This is used to update the ELF header of ELF files.
g++         This is the GNU C++ frontend, which assumes that source files contain C++ code.
gcc         This is the GNU C frontend, which assumes that source files contain C code.
gcov        This is a code coverage tool.
gdb         This is the GNU debugger.
gprof       This is a program profiling tool.
ld          This is the GNU linker.
nm          This lists symbols from object files.
objcopy     This is used to copy and translate object files.
objdump     This is used to display information from object files.
ranlib      This creates or modifies an index in a static library, making the linking stage faster.
readelf     This displays information about files in ELF object format.
size        This lists section sizes and the total size.
strings     This displays strings of printable characters in files.
strip       This is used to strip an object file of debug symbol tables, thus making it smaller. Typically, you would strip all the executable code that is put onto the target.
Looking at the components of the C library
The C library is not a single library file. It is composed of four main parts that together implement the POSIX API:
libc: The main C library that contains the well-known POSIX functions such as printf, open, close, read, write, and so on
libm: Contains maths functions such as cos, exp, and log
libpthread: Contains all the POSIX thread functions with names beginning with pthread_
librt: Has the real-time extensions to POSIX, including shared memory and asynchronous I/O
The first one, libc, is always linked in but the others have to be explicitly linked with the -l option. The parameter to -l is the library name with lib stripped off. For example, a program that calculates a sine function by calling sin() would be linked with libm using -lm:
$ arm-cortex_a8-linux-gnueabihf-gcc myprog.c -o myprog -lm
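The naming rule for -l can be sketched as a small shell function. This is only an illustration of the rule stated above; the function name lflag is hypothetical and is not part of any toolchain. It also strips the .a or .so suffix and any version numbers, on the assumption that the argument is a library file name as it appears in the sysroot.

```shell
#!/bin/sh
# lflag FILE: print the -l option that corresponds to a library file name.
# (Hypothetical helper, for illustration only.)
lflag() {
    name=${1##*/}                 # drop any leading directory
    name=${name#lib}              # drop the "lib" prefix
    case $name in
        *.so|*.so.*) name=${name%%.so*} ;;   # drop ".so" and any version suffix
        *.a)         name=${name%.a} ;;      # drop the static archive suffix
    esac
    printf '%s\n' "-l$name"
}

lflag libm.so.6       # prints -lm
lflag libpthread.a    # prints -lpthread
```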
You can verify which libraries have been linked in this or any other program by using the readelf command:
$ arm-cortex_a8-linux-gnueabihf-readelf -a myprog | grep "Shared library"
0x00000001 (NEEDED) Shared library: [libm.so.6]
0x00000001 (NEEDED) Shared library: [libc.so.6]
Shared libraries need a runtime linker, which you can expose using:
$ arm-cortex_a8-linux-gnueabihf-readelf -a myprog | grep "program interpreter"
[Requesting program interpreter: /lib/ld-linux-armhf.so.3]
This is so useful that I have a script file named list-libs, which you will find in the book code archive in MELP/list-libs. It contains the following commands:
#!/bin/sh
${CROSS_COMPILE}readelf -a "$1" | grep "program interpreter"
${CROSS_COMPILE}readelf -a "$1" | grep "Shared library"
Linking with libraries – static and dynamic linking
Any application you write for Linux, whether it be in C or C++, will be linked with the C library libc. This is so fundamental that you don't even have to tell gcc or g++ to do it because it always links libc. Other libraries that you may want to link with have to be explicitly named through the -l option.
The library code can be linked in two different ways: statically, meaning that all the library functions your application calls and their dependencies are pulled from the library archive and bound into your executable; and dynamically, meaning that references to the library files and functions in those files are generated in the code but the actual linking is done dynamically at runtime. You will find the code for the examples that follow in the book code archive in MELP/chapter_02/library.
Static libraries
Static linking is useful in a few circumstances. For example, if you are building a small system which consists of only BusyBox and some script files, it is simpler to link BusyBox statically and avoid having to copy the runtime library files and linker. It will also be smaller because you only link in the code that your application uses rather than supplying the entire C library. Static linking is also useful if you need to run a program before the filesystem that holds the runtime libraries is available.
You tell gcc to link all the libraries statically by adding -static to the command line:
$ arm-cortex_a8-linux-gnueabihf-gcc -static helloworld.c -o helloworld-static
You will note that the size of the binary increases dramatically:
$ ls -l
-rwxrwxr-x 1 chris chris   5884 Mar 5 09:56 helloworld
-rwxrwxr-x 1 chris chris 614692 Mar 5 10:27 helloworld-static
Static linking pulls code from a library archive, usually named lib[name].a. In the preceding case, it is libc.a, which is in [sysroot]/usr/lib:
$ export SYSROOT=$(arm-cortex_a8-linux-gnueabihf-gcc -print-sysroot)
$ cd $SYSROOT
$ ls -l usr/lib/libc.a
-rw-r--r-- 1 chris chris 3457004 Mar 3 15:21 usr/lib/libc.a
Note that the syntax export SYSROOT=$(arm-cortex_a8-linux-gnueabihf-gcc -print-sysroot) places the path to the sysroot in the shell variable, SYSROOT, which makes the example a little clearer.
Creating a static library is as simple as creating an archive of object files using the ar command. If I have two source files named test1.c and test2.c, and I want to create a static library named libtest.a, then I would do the following:
$ arm-cortex_a8-linux-gnueabihf-gcc -c test1.c
$ arm-cortex_a8-linux-gnueabihf-gcc -c test2.c
$ arm-cortex_a8-linux-gnueabihf-ar rc libtest.a test1.o test2.o
$ ls -l
total 24
-rw-rw-r-- 1 chris chris 2392 Oct 9 09:28 libtest.a
-rw-rw-r-- 1 chris chris  116 Oct 9 09:26 test1.c
-rw-rw-r-- 1 chris chris 1080 Oct 9 09:27 test1.o
-rw-rw-r-- 1 chris chris  121 Oct 9 09:26 test2.c
-rw-rw-r-- 1 chris chris 1088 Oct 9 09:27 test2.o
Then I could link libtest into my helloworld program, using:
$ arm-cortex_a8-linux-gnueabihf-gcc helloworld.c -ltest \
-L../libs -I../libs -o helloworld
Shared libraries
A more common way to deploy libraries is as shared objects that are linked at runtime, which makes more efficient use of storage and system memory, since only one copy of the code needs to be loaded. It also makes it easy to update the library files without having to re-link all the programs that use them.
The object code for a shared library must be position-independent, so that the runtime linker is free to locate it in memory at the next free address. To do this, add the -fPIC parameter to gcc, and then link it using the -shared option:
$ arm-cortex_a8-linux-gnueabihf-gcc -fPIC -c test1.c
$ arm-cortex_a8-linux-gnueabihf-gcc -fPIC -c test2.c
$ arm-cortex_a8-linux-gnueabihf-gcc -shared -o libtest.so test1.o test2.o
This creates the shared library, libtest.so. To link an application with this library, you add -ltest, exactly as in the static case mentioned in the preceding section, but this time the code is not included in the executable. Instead, there is a reference to the library that the runtime linker will have to resolve:
$ arm-cortex_a8-linux-gnueabihf-gcc helloworld.c -ltest \
-L../libs -I../libs -o helloworld
$ MELP/list-libs helloworld
[Requesting program interpreter: /lib/ld-linux-armhf.so.3]
0x00000001 (NEEDED) Shared library: [libtest.so]
0x00000001 (NEEDED) Shared library: [libc.so.6]
The runtime linker for this program is /lib/ld-linux-armhf.so.3, which must be present in the target's filesystem. The linker will look for libtest.so in the default search path: /lib and /usr/lib. If you want it to look for libraries in other directories as well, you can place a colon-separated list of paths in the shell variable LD_LIBRARY_PATH:
# export LD_LIBRARY_PATH=/opt/lib:/opt/usr/lib
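To make the search order concrete, here is a rough sketch of the lookup as a shell function. It is a deliberate simplification: the real runtime linker also consults sources such as RPATH/RUNPATH entries in the executable and the ld.so cache, which are ignored here. The function name find_lib is hypothetical, and its second argument stands in for LD_LIBRARY_PATH.

```shell
#!/bin/sh
# find_lib NAME [PATHS]
# Look for NAME in each directory of the colon-separated PATHS
# (standing in for LD_LIBRARY_PATH), then in /lib and /usr/lib.
# Prints the first match; returns non-zero if nothing is found.
# (Hypothetical helper: a simplification of the real search.)
find_lib() {
    lib=$1
    found=
    old_ifs=$IFS
    IFS=:                          # split PATHS on colons
    for dir in $2 /lib /usr/lib; do
        if [ -z "$found" ] && [ -e "$dir/$lib" ]; then
            found=$dir/$lib        # remember only the first hit
        fi
    done
    IFS=$old_ifs
    [ -n "$found" ] && printf '%s\n' "$found"
}

# Directories listed in PATHS are tried before the defaults
find_lib libtest.so /opt/lib:/opt/usr/lib || echo "libtest.so not found"
```

The result of the final line depends on what is installed on the machine where you run it.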
Understanding shared library version numbers
One of the benefits of shared libraries is that they can be updated independently of the programs that use them. Library updates are of two types: those that fix bugs or add new functions in a backwards-compatible way, and those that break compatibility with existing applications. GNU/Linux has a versioning scheme to handle both these cases.
Each library has a release version and an interface number. The release version is simply a string that is appended to the library name; for example, the JPEG image library libjpeg is currently at release 8.0.2 and so the library is named libjpeg.so.8.0.2. There is a symbolic link named libjpeg.so to libjpeg.so.8.0.2, so that when you compile a program with -ljpeg, you link with the current version. If you install version 8.0.3, the link is updated and you will link with that one instead.
Now suppose that version 9.0.0 comes along and that breaks the backwards compatibility. The link from libjpeg.so now points to libjpeg.so.9.0.0, so that any new programs are linked with the new version, possibly throwing compile errors when the interface to libjpeg changes, which the developer can fix. Any programs on the target that are not recompiled are going to fail in some way, because they are still using the old interface. This is where an object known as the soname helps. The soname encodes the interface number when the library was built and is used by the runtime linker when it loads the library. It is formatted as <library name>.so.<interface number>. For libjpeg.so.8.0.2, the soname is libjpeg.so.8:
$ readelf -a /usr/lib/libjpeg.so.8.0.2 | grep SONAME
0x000000000000000e (SONAME) Library soname: [libjpeg.so.8]
Any program compiled with it will request libjpeg.so.8 at runtime, which will be a symbolic link on the target to libjpeg.so.8.0.2. When version 9.0.0 of libjpeg is installed, it will have a soname of libjpeg.so.9, and so it is possible to have two incompatible versions of the same library installed on the same system. Programs that were linked with libjpeg.so.8.*.* will load libjpeg.so.8, and those linked with libjpeg.so.9.*.* will load libjpeg.so.9.
This is why, when you look at the directory listing of <sysroot>/usr/lib/libjpeg*, you find these four files:
libjpeg.a: This is the library archive used for static linking
libjpeg.so -> libjpeg.so.8.0.2: This is a symbolic link, used for dynamic linking
libjpeg.so.8 -> libjpeg.so.8.0.2: This is a symbolic link, used when loading the library at runtime
libjpeg.so.8.0.2: This is the actual shared library, used at both compile time and runtime
The first two are only needed on the host computer for building and the last two are needed on the target at runtime.
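The relationship between the three names can be sketched with shell parameter expansion. This is only an illustration, assuming the real file name follows the lib<name>.so.<interface>.<rest> pattern described above; the function name lib_links is hypothetical.

```shell
#!/bin/sh
# lib_links REALNAME
# From a real file name such as libjpeg.so.8.0.2, derive the soname
# (requested at runtime) and the link name (used with -l at build time).
# (Hypothetical helper, for illustration only.)
lib_links() {
    real=$1
    base=${real%%.so.*}            # "libjpeg"
    version=${real#"$base".so.}    # "8.0.2"
    interface=${version%%.*}       # "8"
    printf 'soname=%s.so.%s\n' "$base" "$interface"
    printf 'linkname=%s.so\n' "$base"
}

lib_links libjpeg.so.8.0.2
# prints:
#   soname=libjpeg.so.8
#   linkname=libjpeg.so
```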
The art of cross compiling
Having a working cross toolchain is the starting point of a journey, not the end of it. At some point, you will want to begin cross compiling the various tools, applications, and libraries that you need on your target. Many of them will be open source packages, each of which has its own method of compiling and its own peculiarities. There are some common build systems, including:
Pure makefiles, where the toolchain is usually controlled by the make variable CROSS_COMPILE
The GNU build system known as Autotools
CMake (https://cmake.org/)
I will cover only the first two here since these are the ones needed for even a basic embedded Linux system. For CMake, there are some excellent resources on the CMake website listed above.
Simple makefiles
Some important packages are very simple to cross compile, including the Linux kernel, the U-Boot bootloader, and BusyBox. For each of these, you only need to put the toolchain prefix in the make variable CROSS_COMPILE, for example arm-cortex_a8-linux-gnueabihf-. Note the trailing dash, -.
So, to compile BusyBox, you would type:
$ make CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf-
Or, you can set it as a shell variable:
$ export CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf-
$ make
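The reason for the trailing dash is that makefiles of this kind typically form each tool name by prepending the prefix, for example CC = $(CROSS_COMPILE)gcc. The following sketch mimics that in the shell; the function name cross_tool is hypothetical and is shown only to illustrate the string concatenation.

```shell
#!/bin/sh
# cross_tool NAME: print the tool name a CROSS_COMPILE-aware makefile
# would use, i.e. the prefix immediately followed by NAME.
# (Hypothetical helper, for illustration only.)
cross_tool() {
    printf '%s%s\n' "${CROSS_COMPILE:-}" "$1"
}

CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf-
cross_tool gcc      # prints arm-cortex_a8-linux-gnueabihf-gcc
cross_tool strip    # prints arm-cortex_a8-linux-gnueabihf-strip

CROSS_COMPILE=
cross_tool gcc      # prints gcc, that is, the native compiler
```

Without the trailing dash, the first call would produce arm-cortex_a8-linux-gnueabihfgcc, which does not exist.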
In the case of U-Boot and Linux, you also have to set the make variable ARCH to one of the machine architectures they support, which I will cover in Chapter 3, All About Bootloaders, and Chapter 4, Configuring and Building the Kernel.
Autotools
The name Autotools refers to a group of tools that are used as the build system in many open source projects. The components, together with the appropriate project pages, are:
GNU Autoconf (https://www.gnu.org/software/autoconf/autoconf.html)
GNU Automake (https://www.gnu.org/savannah-checkouts/gnu/automake/)
GNU Libtool (https://www.gnu.org/software/libtool/libtool.html)
Gnulib (https://www.gnu.org/software/gnulib/)
The role of Autotools is to smooth over the differences between the many different types of systems that the package may be compiled for, accounting for different versions of compilers, different versions of libraries, different locations of header files, and dependencies with other packages. Packages that use Autotools come with a script named configure that checks dependencies and generates makefiles according to what it finds. The configure script may also give you the opportunity to enable or disable certain features. You can find the options on offer by running ./configure --help.
To configure, build, and install a package for the native operating system, you would typically run the following three commands:
$ ./configure
$ make
$ sudo make install
Autotools is able to handle cross development as well. You can influence the behavior of the configure script by setting these shell variables:
CC: The C compiler command
CFLAGS: Additional C compiler flags
LDFLAGS: Additional linker flags; for example, if you have libraries in a non-standard directory <lib dir>, you would add it to the library search path by adding -L<lib dir>
LIBS: Contains a list of additional libraries to pass to the linker; for instance, -lm for the math library
CPPFLAGS: Contains C/C++ preprocessor flags; for example, you would add -I<include dir> to search for headers in a non-standard directory <include dir>
CPP: The C preprocessor to use
Sometimes it is sufficient to set only the CC variable, as follows:
$ CC=arm-cortex_a8-linux-gnueabihf-gcc ./configure
At other times, that will result in an error like this:
[...]
checking whether we are cross compiling... configure: error: in '/home/chris/MELP/build/sqlite-autoconf-3081101':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use '--host'.
See 'config.log' for more details
The reason for the failure is that configure often tries to discover the capabilities of the toolchain by compiling snippets of code and running them to see what happens, which cannot work if the program has been cross compiled. Nevertheless, there is a hint in the error message on how to solve the problem. Autotools understands three different types of machines that may be involved when compiling a package:
Build is the computer that builds the package, which defaults to the current machine.
Host is the computer the program will run on; for a native compile, this is left blank and it defaults to be the same computer as build. When you are cross compiling, set it to be the tuple of your toolchain.
Target is the computer the program will generate code for; you would set this when building a cross compiler, for example.
So, to cross compile, you just need to override the host, as follows:
$ CC=arm-cortex_a8-linux-gnueabihf-gcc \
./configure --host=arm-cortex_a8-linux-gnueabihf
One final thing to note is that the default install directory is <sysroot>/usr/local/*. You would usually install it in <sysroot>/usr/*, so that the header files and libraries would be picked up from their default locations. The complete command to configure a typical Autotools package is as follows:
$ CC=arm-cortex_a8-linux-gnueabihf-gcc \
./configure --host=arm-cortex_a8-linux-gnueabihf --prefix=/usr
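Since the --host tuple is just the toolchain prefix without its trailing dash, the two can be derived from one value. The following is a small sketch under that assumption; the function name configure_cmd is hypothetical, and it only prints the command line rather than running configure.

```shell
#!/bin/sh
# configure_cmd PREFIX
# PREFIX is a toolchain prefix including the trailing dash.
# Prints the configure command line used in the text, with --host set to
# the prefix minus its trailing dash.
# (Hypothetical helper, for illustration only.)
configure_cmd() {
    tuple=${1%-}                   # strip one trailing dash
    printf 'CC=%sgcc ./configure --host=%s --prefix=/usr\n' "$1" "$tuple"
}

configure_cmd arm-cortex_a8-linux-gnueabihf-
# prints: CC=arm-cortex_a8-linux-gnueabihf-gcc ./configure --host=arm-cortex_a8-linux-gnueabihf --prefix=/usr
```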
An example: SQLite
The SQLite library implements a simple relational database and is quite popular on embedded devices. You begin by getting a copy of SQLite:
$ wget http://www.sqlite.org/2015/sqlite-autoconf-3081101.tar.gz
$ tar xf sqlite-autoconf-3081101.tar.gz
$ cd sqlite-autoconf-3081101
Next, run the configure script:
$ CC=arm-cortex_a8-linux-gnueabihf-gcc \
./configure --host=arm-cortex_a8-linux-gnueabihf --prefix=/usr
That seems to work! If it had failed, there would be error messages printed to the terminal and recorded in config.log. Note that several makefiles have been created, so now you can build it:
$ make
Finally, you install it into the toolchain directory by setting the make variable DESTDIR. If you don't, it will try to install it into the host computer's /usr directory, which is not what you want:
$ make DESTDIR=$(arm-cortex_a8-linux-gnueabihf-gcc -print-sysroot) install
You may find that the final command fails with a file permissions error. A crosstool-NG toolchain is read-only by default, which is why it is useful to disable CT_INSTALL_DIR_RO when building it. Another common problem is that the toolchain is installed in a system directory, such as /opt or /usr/local, in which case you will need root permissions when running the install.
After installing, you should find that various files have been added to your toolchain:
<sysroot>/usr/bin: sqlite3: This is a command-line interface for SQLite that you can install and run on the target
<sysroot>/usr/lib: libsqlite3.so.0.8.6, libsqlite3.so.0, libsqlite3.so, libsqlite3.la, libsqlite3.a: These are the shared and static libraries
<sysroot>/usr/lib/pkgconfig: sqlite3.pc: This is the package configuration file, as described in the following section
<sysroot>/usr/include: sqlite3.h, sqlite3ext.h: These are the header files
<sysroot>/usr/share/man/man1: sqlite3.1: This is the manual page
Now you can compile programs that use sqlite3 by adding -lsqlite3 at the link stage. Place it after the source files that use the library, because the linker resolves symbols in command-line order:
$ arm-cortex_a8-linux-gnueabihf-gcc sqlite-test.c -lsqlite3 -o sqlite-test
Here, sqlite-test.c is a hypothetical program that calls SQLite functions. Since sqlite3 has been installed into the sysroot, the compiler will find the header and library files without any problem. If they had been installed elsewhere, you would have had to add -L<lib dir> and -I<include dir>.
Naturally, there will be runtime dependencies as well, and you will have to install the appropriate files into the target directory as described in Chapter 5, Building a Root Filesystem.
Package configuration
Tracking package dependencies is quite complex. The package configuration utility pkg-config (https://www.freedesktop.org/wiki/Software/pkg-config/) helps track which packages are installed and which compile flags each needs by keeping a database of Autotools packages in [sysroot]/usr/lib/pkgconfig. For instance, the one for SQLite3 is named sqlite3.pc and contains essential information needed by other packages that need to make use of it:
$ cat $(arm-cortex_a8-linux-gnueabihf-gcc -print-sysroot)/usr/lib/pkgconfig/sqlite3.pc
# Package Information for pkg-config
prefix=/usr
exec_prefix=${prefix}
libdir=${exec_prefix}/lib
includedir=${prefix}/include
Name: SQLite
Description: SQL database engine
Version: 3.8.11.1
Libs: -L${libdir} -lsqlite3
Libs.private: -ldl -lpthread
Cflags: -I${includedir}
$ pkg-config sqlite3 --libs --cflags
Package sqlite3 was not found in the pkg-config search path.
Perhaps you should add the directory containing `sqlite3.pc'
to the PKG_CONFIG_PATH environment variable
No package 'sqlite3' found
Oops! That failed because it was looking in the host's sysroot and the development package for libsqlite3 has not been installed on the host. You need to point it at the sysroot of the target toolchain by setting the shell variable PKG_CONFIG_LIBDIR:
$ export PKG_CONFIG_LIBDIR=$(arm-cortex_a8-linux-gnueabihf-gcc \
-print-sysroot)/usr/lib/pkgconfig
$ pkg-config sqlite3 --libs --cflags
-lsqlite3
Now the output is -lsqlite3. In this case, you knew that already, but generally you wouldn't, so this is a valuable technique. The final commands to compile would be:
$ export PKG_CONFIG_LIBDIR=$(arm-cortex_a8-linux-gnueabihf-gcc \
-print-sysroot)/usr/lib/pkgconfig
$ arm-cortex_a8-linux-gnueabihf-gcc $(pkg-config sqlite3 --cflags --libs) \
sqlite-test.c -o sqlite-test
Problems with cross compiling
SQLite is a well-behaved package and cross compiles nicely, but not all packages are the same. Typical pain points include:
Home-grown build systems; zlib, for example, has a configure script, but it does not behave like the Autotools configure described in the previous section
Configure scripts that read pkg-config information, headers, and other files from the host, disregarding the --host override
Scripts that insist on trying to run cross compiled code
Each case requires careful analysis of the error and additional parameters to the configure script to provide the correct information, or patches to the code to avoid the problem altogether. Bear in mind that one package may have many dependencies, especially with programs that have a graphical interface using GTK or Qt, or that handle multimedia content. As an example, mplayer, which is a popular tool for playing multimedia content, has dependencies on over 100 libraries. It would take weeks of effort to build them all.
Therefore, I would not recommend manually cross compiling components for the target in this way, except when there is no alternative or the number of packages to build is small. A much better approach is to use a build tool such as Buildroot or the Yocto Project, or avoid the problem altogether by setting up a native build environment for your target architecture. Now you can see why distributions like Debian are always compiled natively.
Summary
The toolchain is always your starting point; everything that follows from that is dependent on having a working, reliable toolchain.
Most embedded build environments are based on a cross development toolchain, which creates a clear separation between a powerful host computer building the code and a target computer on which it runs. The toolchain itself consists of the GNU binutils, a C compiler from the GNU compiler collection (and quite likely the C++ compiler as well), plus one of the C libraries I have described. Usually, the GNU debugger, GDB, will be generated at this point, which I describe in Chapter 14, Debugging with GDB. Also, keep an eye on the Clang compiler, as it will develop over the next few years.
You may start with nothing but a toolchain, perhaps built using crosstool-NG or downloaded from Linaro, and use it to compile all the packages that you need on your target, accepting the amount of hard work this will entail. Or you may obtain the toolchain as part of a distribution which includes a range of packages. A distribution can be generated from source code using a build system such as Buildroot or the Yocto Project, or it can be a binary distribution from a third party, maybe a commercial enterprise like Mentor Graphics, or an open source project such as the Denx ELDK. Beware of toolchains or distributions that are offered to you for free as part of a hardware package; they are often poorly configured and not maintained. In any case, you should make your choice according to your situation, and then be consistent in its use throughout the project.
Once you have a toolchain, you can use it to build the other components of your embedded Linux system. In the next chapter, you will learn about the bootloader, which brings your device to life and begins the boot process.
All About Bootloaders
The bootloader is the second element of embedded Linux. It is the part that starts the system up and loads the operating system kernel. In this chapter, I will look at the role of the bootloader and, in particular, how it passes control from itself to the kernel using a data structure called a device tree, also known as a flattened device tree or FDT. I will cover the basics of device trees, so that you will be able to follow the connections described in a device tree and relate it to real hardware.
I will look at the popular open source bootloader, U-Boot, and show you how to use it to boot a target device, and also how to customize it to run on a new device, using the BeagleBone Black as an example. Finally, I will take a quick look at Barebox, a bootloader that shares its past with U-Boot, but which has, arguably, a cleaner design.
In this chapter, we will cover the following topics:
What does a bootloader do?
The boot sequence.
Booting with UEFI firmware.
Moving from bootloader to kernel.
Introducing device trees.
Choosing a bootloader.
U-Boot.
Barebox.
What does a bootloader do?
In an embedded Linux system, the bootloader has two main jobs: to initialize the system to a basic level and to load the kernel. In fact, the first job is somewhat subsidiary to the second, in that it is only necessary to get as much of the system working as is needed to load the kernel.
When the first lines of the bootloader code are executed, following a power-on or a reset, the system is in a very minimal state. The DRAM controller would not have been set up, and so the main memory would not be accessible. Likewise, other interfaces would not have been configured, so storage accessed via NAND flash controllers, MMC controllers, and so on, would also not be usable. Typically, the only resources operational at the beginning are a single CPU core and some on-chip static memory. As a result, system bootstrap consists of several phases of code, each bringing more of the system into operation. The final act of the bootloader is to load the kernel into RAM and create an execution environment for it. The details of the interface between the bootloader and the kernel are architecture-specific, but in each case it has to do two things. First, the bootloader has to pass a pointer to a structure containing information about the hardware configuration, and second, it has to pass a pointer to the kernel command line. The kernel command line is a text string that controls the behavior of Linux. Once the kernel has begun executing, the bootloader is no longer needed and all the memory it was using can be reclaimed.
A subsidiary job of the bootloader is to provide a maintenance mode for updating boot configurations, loading new boot images into memory, and, maybe, running diagnostics. This is usually controlled by a simple command-line user interface, commonly over a serial interface.
ThebootsequenceInsimplertimes,someyearsago,itwasonlynecessarytoplacethebootloaderinnon-volatilememoryattheresetvectoroftheprocessor.NORflashmemorywascommonatthattimeand,sinceitcanbemappeddirectlyintotheaddressspace,itwastheidealmethodofstorage.Thefollowingdiagramshowssuchaconfiguration,withtheResetvectorat0xfffffffcatthetopendofanareaofflashmemory.Thebootloaderislinkedsothatthereisajumpinstructionatthatlocationthatpointstothestartofthebootloadercode:
Fromthatpoint,thebootloadercoderunninginNORflashmemorycaninitializetheDRAMcontroller,sothatthemainmemory,theDRAM,becomesavailableandthenitcopiesitselfintotheDRAM.Oncefullyoperational,thebootloadercanloadthekernelfromflashmemoryintoDRAMandtransfercontroltoit.
However,onceyoumoveawayfromasimplelinearlyaddressablestoragemediumlikeNORflash,thebootsequencebecomesacomplex,multi-stageprocedure.ThedetailsareveryspecifictoeachSoC,buttheygenerallyfolloweachofthefollowingphases.
Phase 1 – ROM code
In the absence of reliable external memory, the code that runs immediately after a reset or power-on has to be stored on-chip in the SoC; this is known as ROM code. It is loaded into the chip when it is manufactured, and hence the ROM code is proprietary and cannot be replaced by an open source equivalent. Usually, it does not include code to initialize the memory controller, since DRAM configurations are highly device-specific, and so it can only use Static Random Access Memory (SRAM), which does not require a memory controller.
Most embedded SoC designs have a small amount of SRAM on-chip, varying in size from as little as 4 KB to several hundred KB:
The ROM code is capable of loading a small chunk of code from one of several pre-programmed locations into the SRAM. As an example, TI OMAP and Sitara chips try to load code from the first few pages of NAND flash memory, or from flash memory connected through a Serial Peripheral Interface (SPI), or from the first sectors of an MMC device (which could be an eMMC chip or an SD card), or from a file named MLO on the first partition of an MMC device. If reading from all of these memory devices fails, then it tries reading a byte stream from Ethernet, USB, or UART; the latter is provided mainly as a means of loading code into flash memory during production, rather than for use in normal operation. Most embedded SoCs have a ROM code that works in a similar way. In SoCs where the SRAM is not large enough to load a full bootloader like U-Boot, there has to be an intermediate loader called the secondary program loader, or SPL.
At the end of the ROM code phase, the SPL is present in the SRAM and the ROM code jumps to the beginning of that code.
Phase 2 – secondary program loader
The SPL must set up the memory controller and other essential parts of the system preparatory to loading the Tertiary Program Loader (TPL) into DRAM. The functionality of the SPL is limited by the size of the SRAM. It can read a program from a list of storage devices, as can the ROM code, once again using pre-programmed offsets from the start of a flash device. If the SPL has filesystem drivers built in, it can read well-known filenames, such as u-boot.img, from a disk partition. The SPL usually doesn't allow for any user interaction, but it may print version information and progress messages, which you can see on the console. The following diagram explains the phase 2 architecture:
The SPL may be open source, as is the case with the TI x-loader and Atmel AT91Bootstrap, but it is quite common for it to contain proprietary code that is supplied by the manufacturer as a binary blob.
At the end of the second phase, the TPL is present in DRAM, and the SPL can make a jump to that area.
Phase 3 – TPL
Now, at last, we are running a full bootloader, such as U-Boot or Barebox. Usually, there is a simple command-line user interface that lets you perform maintenance tasks, such as loading new boot and kernel images into flash storage, and loading and booting a kernel, and there is a way to load the kernel automatically without user intervention.
The following diagram explains the phase 3 architecture:
At the end of the third phase, there is a kernel in memory, waiting to be started.
Embedded bootloaders usually disappear from memory once the kernel is running, and perform no further part in the operation of the system.
Booting with UEFI firmware
Most embedded x86/x86_64 designs, and some ARM designs, have firmware based on the Unified Extensible Firmware Interface (UEFI) standard. You can take a look at the UEFI website at http://www.uefi.org/ for more information. The boot sequence is fundamentally the same as that described in the preceding section:
Phase 1: The processor loads the platform initialization firmware from flash memory. In some designs, it is loaded directly from NOR flash memory, while in others, there is ROM code on-chip which loads the firmware from SPI flash memory into some on-chip static RAM.
Phase 2: The platform initialization firmware performs the role of SPL. It initializes the DRAM controller and other system interfaces, so as to be able to load an EFI boot manager from the EFI System Partition (ESP) on a local disk, or from a network server via PXE boot. The ESP must be formatted using FAT16 or FAT32 format and it should have the well-known GUID value of C12A7328-F81F-11D2-BA4B-00A0C93EC93B. The path name of the boot manager code must follow the naming convention <efi_system_partition>/efi/boot/boot<machine_type_short_name>.efi. For example, the file path to the loader on an x86_64 system, relative to the root of the ESP, would be /efi/boot/bootx64.efi.
Phase 3: The UEFI boot manager is the tertiary program loader. The TPL in this case has to be a bootloader that is capable of loading a Linux kernel and an optional RAM disk into memory. Common choices are:
systemd-boot: This used to be called gummiboot. It is a simple UEFI-compatible bootloader, licensed under LGPL v2.1. The website is https://www.freedesktop.org/wiki/Software/systemd/systemd-boot/.
Tummiboot: This is the gummiboot with trusted boot support (Intel's Trusted Execution Technology (TXT)).
Moving from bootloader to kernel
When the bootloader passes control to the kernel, it has to pass some basic information, which may include some of the following:
The machine number, which is used on PowerPC, and ARM platforms without support for a device tree, to identify the type of the SoC
Basic details of the hardware detected so far, including at least the size and location of the physical RAM, and the CPU clock speed
The kernel command line
Optionally, the location and size of a device tree binary
Optionally, the location and size of an initial RAM disk, called the initial RAM file system (initramfs)
The kernel command line is a plain ASCII string which controls the behavior of Linux by giving, for example, the name of the device that contains the root filesystem. I will look at the details of this in the next chapter. It is common to provide the root filesystem as a RAM disk, in which case it is the responsibility of the bootloader to load the RAM disk image into memory. I will cover the way you create initial RAM disks in Chapter 5, Building a Root Filesystem.
The way this information is passed is dependent on the architecture and has changed in recent years. For instance, with PowerPC, the bootloader simply used to pass a pointer to a board information structure, whereas, with ARM, it passed a pointer to a list of Atags. There is a good description of the format of Atags in the kernel source in Documentation/arm/Booting.
In both cases, the amount of information passed was very limited, leaving the bulk of it to be discovered at runtime or hard-coded into the kernel as platform data. The widespread use of platform data meant that each board had to have a kernel configured and modified for that platform. A better way was needed, and that way is the device tree. In the ARM world, the move away from Atags began in earnest in February 2013 with the release of Linux 3.8. Today, almost all ARM systems use device tree to gather information about the specifics of the hardware platform, allowing a single kernel binary to run on a wide range of those platforms.
Introducing device trees
If you are working with ARM or PowerPC SoCs, you are almost certainly going to encounter device trees at some point. This section aims to give you a quick overview of what they are and how they work, but there are many details that are not discussed.
A device tree is a flexible way to define the hardware components of a computer system. Usually, the device tree is loaded by the bootloader and passed to the kernel, although it is possible to bundle the device tree with the kernel image itself to cater for bootloaders that are not capable of loading them separately.
The format is derived from a Sun Microsystems bootloader known as OpenBoot, which was formalized as the Open Firmware specification, which is IEEE standard IEEE1275-1994. It was used in PowerPC-based Macintosh computers and so was a logical choice for the PowerPC Linux port. Since then, it has been adopted on a large scale by the many ARM Linux implementations and, to a lesser extent, by MIPS, MicroBlaze, ARC, and other architectures.
I would recommend visiting https://www.devicetree.org/ for more information.
Device tree basics
The Linux kernel contains a large number of device tree source files in arch/$ARCH/boot/dts, and this is a good starting point for learning about device trees. There are also a smaller number of sources in the U-Boot source code in arch/$ARCH/dts. If you acquired your hardware from a third party, the dts file forms part of the board support package and you should expect to receive one along with the other source files.
The device tree represents a computer system as a collection of components joined together in a hierarchy, like a tree. The device tree begins with a root node, represented by a forward slash, /, which contains subsequent nodes representing the hardware of the system. Each node has a name and contains a number of properties in the form name = "value". Here is a simple example:
/dts-v1/;
/ {
    model = "TI AM335x BeagleBone";
    compatible = "ti,am33xx";
    #address-cells = <1>;
    #size-cells = <1>;
    cpus {
        #address-cells = <1>;
        #size-cells = <0>;
        cpu@0 {
            compatible = "arm,cortex-a8";
            device_type = "cpu";
            reg = <0>;
        };
    };
    memory@0x80000000 {
        device_type = "memory";
        reg = <0x80000000 0x20000000>; /* 512 MB */
    };
};
Here we have a root node which contains a cpus node and a memory node. The cpus node contains a single CPU node named cpu@0. It is a common convention that the names of nodes include an @ followed by an address that distinguishes this node from other nodes of the same type.
Both the root and CPU nodes have a compatible property. The Linux kernel uses this property to find a matching device driver by comparing it with the strings exported by each device driver in a structure of_device_id (more on this in Chapter 9, Interfacing with Device Drivers).
It is a convention that the value is composed of a manufacturer name and a component name, to reduce confusion between similar devices made by different manufacturers; hence, ti,am33xx and arm,cortex-a8. It is also quite common to have more than one value for the compatible property where there is more than one driver that can handle this device. They are listed with the most suitable first.
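To make the "most suitable first" rule concrete, here is a minimal Python sketch of the matching idea. It is a simplified model and not the kernel's implementation; the driver names and the acme,... compatible strings are hypothetical and chosen only for illustration.

```python
def find_driver(node_compatible, drivers):
    """Return the name of the first driver that matches the node, or None.

    node_compatible: the node's compatible strings, most suitable first.
    drivers: dict mapping a (hypothetical) driver name to the list of
             compatible strings that driver claims to support.
    """
    for wanted in node_compatible:              # honour the node's order
        for name, supported in drivers.items():
            if wanted in supported:
                return name
    return None


# Hypothetical drivers: one generic, one specific to a newer device variant
drivers = {
    "widget-generic": ["acme,widget"],
    "widget-v2": ["acme,widget-v2"],
}

# The node lists the specific string first and the generic fallback second
node = ["acme,widget-v2", "acme,widget"]
print(find_driver(node, drivers))
```

If the widget-v2 driver were not present, the same node would fall back to widget-generic, which is the reason for listing more than one value.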
The CPU node and the memory node have a device_type property which describes the class of device. The node name is often derived from device_type.
The reg property
The memory and cpu nodes have a reg property, which refers to a range of units in a register space. A reg property consists of two values representing the start address and the size (length) of the range. Both are written as zero or more 32-bit integers, called cells. Hence, the memory node refers to a single bank of memory that begins at 0x80000000 and is 0x20000000 bytes long.
Understanding reg properties becomes more complex when the address or size values cannot be represented in 32 bits. For example, on a device with 64-bit addressing, you need two cells for each:
/ {
    #address-cells = <2>;
    #size-cells = <2>;
    memory@80000000 {
        device_type = "memory";
        reg = <0x00000000 0x80000000 0 0x80000000>;
    };
};
The information about the number of cells required is held in the #address-cells and #size-cells properties in an ancestor node. In other words, to understand a reg property, you have to look back up the node hierarchy, towards the root, until you find #address-cells and #size-cells. If there are none, the default values are 1 for each, but it is bad practice for device tree writers to depend on fall-backs.
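The following Python sketch shows how a flat list of cells can be grouped into address and size values once #address-cells and #size-cells are known. The function name decode_reg is hypothetical; it only illustrates the arithmetic and is not part of dtc or the kernel.

```python
def decode_reg(cells, address_cells, size_cells):
    """Split a flat list of 32-bit cells into (address, size) pairs."""
    stride = address_cells + size_cells
    if stride == 0 or len(cells) % stride != 0:
        raise ValueError("cell count does not match the cell sizes")

    def join(parts):
        # Concatenate 32-bit cells, most significant cell first
        value = 0
        for cell in parts:
            value = (value << 32) | cell
        return value

    pairs = []
    for i in range(0, len(cells), stride):
        address = join(cells[i:i + address_cells])
        size = join(cells[i + address_cells:i + stride])
        pairs.append((address, size))
    return pairs


# 32-bit example: reg = <0x80000000 0x20000000>
print(decode_reg([0x80000000, 0x20000000], 1, 1))
# 64-bit example: reg = <0x00000000 0x80000000 0 0x80000000>
print(decode_reg([0x00000000, 0x80000000, 0, 0x80000000], 2, 2))
```

With #size-cells set to 0, as in a cpus node, each entry is just an address and the size comes out as zero.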
Now, let's return to the cpu and cpus nodes. CPUs have addresses as well; in a quad core device, they might be addressed as 0, 1, 2, and 3. That can be thought of as a one-dimensional array without any depth, so the size is zero. Therefore, you can see that we have #address-cells = <1> and #size-cells = <0> in the cpus node, and in the child node, cpu@0, we assign a single value to the reg property, reg = <0>.
Labels and interrupts
The structure of the device tree described so far assumes that there is a single hierarchy of components, whereas in fact there are several. As well as the obvious data connection between a component and other parts of the system, it might also be connected to an interrupt controller, to a clock source, and to a voltage regulator. To express these connections, we can add a label to a node and reference the label from other nodes. These labels are sometimes referred to as phandles, because when the device tree is compiled, nodes with a reference from another node are assigned a unique numerical value in a property called phandle. You can see them if you decompile the device tree binary.
Take as an example a system containing an LCD controller which can generate interrupts and an interrupt-controller:
/dts-v1/;
/ {
    intc: interrupt-controller@48200000 {
        compatible = "ti,am33xx-intc";
        interrupt-controller;
        #interrupt-cells = <1>;
        reg = <0x48200000 0x1000>;
    };
    lcdc: lcdc@4830e000 {
        compatible = "ti,am33xx-tilcdc";
        reg = <0x4830e000 0x1000>;
        interrupt-parent = <&intc>;
        interrupts = <36>;
        ti,hwmods = "lcdc";
        status = "disabled";
    };
};
Here we have node interrupt-controller@48200000 with the label intc. The interrupt-controller property identifies it as an interrupt controller. Like all interrupt controllers, it has an #interrupt-cells property, which tells us how many cells are needed to represent an interrupt source. In this case, there is only one, which represents the interrupt request (IRQ) number. Other interrupt controllers may use additional cells to characterize the interrupt, for example to indicate whether it is edge or level triggered. The number of interrupt cells and their meanings is described in the bindings for each interrupt controller. The device tree bindings can be found in the Linux kernel source, in the directory Documentation/devicetree/bindings/.
Looking at the lcdc@4830e000 node, it has an interrupt-parent property, which references the interrupt controller it is connected to, using the label. It also has an interrupts property, 36 in this case. Note that this node has its own label, lcdc, which is used elsewhere: any node can have a label.
Device tree include files
A lot of hardware is common between SoCs of the same family and between boards using the same SoC. This is reflected in the device tree by splitting out common sections into include files, usually with the extension .dtsi. The Open Firmware standard defines /include/ as the mechanism to be used, as in this snippet from vexpress-v2p-ca9.dts:
/include/ "vexpress-v2m.dtsi"
Look through the .dts files in the kernel, though, and you will find an alternative include statement that is borrowed from C, for example in am335x-boneblack.dts:
#include "am33xx.dtsi"
#include "am335x-bone-common.dtsi"
Here is another example from am33xx.dtsi:
#include <dt-bindings/gpio/gpio.h>
#include <dt-bindings/pinctrl/am33xx.h>
Lastly, include/dt-bindings/pinctrl/am33xx.h contains normal C macros:
#define PULL_DISABLE (1 << 3)
#define INPUT_EN (1 << 5)
#define SLEWCTRL_SLOW (1 << 6)
#define SLEWCTRL_FAST 0
All of this is resolved if the device tree sources are built using the Kbuild system, which first runs them through the C pre-processor, CPP, where the #include and #define statements are processed into text that is suitable for the device tree compiler. The motivation is illustrated in the previous example; it means that the device tree sources can use the same definitions of constants as the kernel code.
When we include files, using either syntax, the nodes are overlaid on top of one another to create a composite tree in which the outer layers extend or modify the inner ones. For example, am33xx.dtsi, which is general to all am33xx SoCs, defines the first MMC controller interface like this:
mmc1: mmc@48060000 {
    compatible = "ti,omap4-hsmmc";
    ti,hwmods = "mmc1";
    ti,dual-volt;
    ti,needs-special-reset;
    ti,needs-special-hs-handling;
    dmas = <&edma 24 &edma 25>;
    dma-names = "tx", "rx";
    interrupts = <64>;
    interrupt-parent = <&intc>;
    reg = <0x48060000 0x1000>;
    status = "disabled";
};
Note that the status is disabled, meaning that no device driver should be bound to it, and also that it has the label mmc1.
Both the BeagleBone and the BeagleBone Black have a microSD card interface attached to mmc1, hence in am335x-bone-common.dtsi, the same node is referenced by its label, &mmc1:
&mmc1 {
    status = "okay";
    bus-width = <0x4>;
    pinctrl-names = "default";
    pinctrl-0 = <&mmc1_pins>;
    cd-gpios = <&gpio0 6 GPIO_ACTIVE_HIGH>;
    cd-inverted;
};
The status property is set to okay, which causes the mmc device driver to bind with this interface at runtime on both variants of the BeagleBone. Also, a reference is added to the pin control configuration, mmc1_pins. Alas, there is not sufficient space here to describe pin control and pin multiplexing. You will find some information in the Linux kernel source in the directory Documentation/devicetree/bindings/pinctrl.
However, interface mmc1 is connected to a different voltage regulator on the BeagleBone Black. This is expressed in am335x-boneblack.dts, where you will see another reference to mmc1, which associates it with the voltage regulator via label vmmcsd_fixed:
&mmc1 {
    vmmc-supply = <&vmmcsd_fixed>;
};
So, layering device tree source files like this gives flexibility and reduces the need for duplicated code.
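The layering of the mmc1 node can be modelled with nested dictionaries. This Python sketch is only an illustration of the overlay idea, under the assumption that a later definition of a property replaces an earlier one and that child nodes are merged recursively; the overlay function is hypothetical and is not how dtc is implemented.

```python
def overlay(base, extra):
    """Return a new node in which 'extra' extends or overrides 'base'."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)   # child node: merge
        else:
            merged[key] = value                         # property: last wins
    return merged


# Abbreviated versions of the three layers shown above
soc = {"compatible": "ti,omap4-hsmmc", "status": "disabled"}
bone_common = {"status": "okay", "bus-width": 4}
bone_black = {"vmmc-supply": "&vmmcsd_fixed"}

mmc1 = overlay(overlay(soc, bone_common), bone_black)
print(mmc1["status"], mmc1["bus-width"], mmc1["vmmc-supply"])
```

The result keeps the compatible string from the SoC file, takes the status from the common board file, and gains the regulator reference from the BeagleBone Black file.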
Compiling a device tree
The bootloader and kernel require a binary representation of the device tree, so it has to be compiled using the device tree compiler, dtc. The result is a file ending with .dtb, which is referred to as a device tree binary or a device tree blob.
There is a copy of dtc in the Linux source, in scripts/dtc/dtc, and it is also available as a package on many Linux distributions. You can use it to compile a simple device tree (one that does not use #include) like this:
$ dtc simpledts-1.dts -o simpledts-1.dtb
DTC: dts->dts on file "simpledts-1.dts"
Be wary of the fact that dtc does not give helpful error messages and it makes no checks other than on the basic syntax of the language, which means that debugging a typing error in a source file can be a lengthy business.
To build more complex examples, you will have to use the kernel Kbuild, as shown in the next chapter.
Choosing a bootloader
Bootloaders come in all shapes and sizes. The characteristics you want from a bootloader are that it be simple and customizable, with lots of sample configurations for common development boards and devices. The following table shows a number of bootloaders that are in general use:
Name            Main architectures supported
Das U-Boot      ARC, ARM, Blackfin, MicroBlaze, MIPS, Nios2, OpenRISC, PowerPC, SH
Barebox         ARM, Blackfin, MIPS, Nios2, PowerPC
GRUB 2          x86, x86_64
Little Kernel   ARM
RedBoot         ARM, MIPS, PowerPC, SH
CFE             Broadcom MIPS
YAMON           MIPS
We are going to focus on U-Boot because it supports a good number of processor architectures and a large number of individual boards and devices. It has been around for a long time and has a good community for support.
It may be that you received a bootloader along with your SoC or board. As always, take a good look at what you have and ask questions about where you can get the source code from, what the update policy is, how they will support you if you want to make changes, and so on. You may want to consider abandoning the vendor-supplied loader and using the current version of an open source bootloader instead.
U-Boot
U-Boot, or to give its full name, Das U-Boot, began life as an open source bootloader for embedded PowerPC boards. Then, it was ported to ARM-based boards and later to other architectures, including MIPS and SH. It is hosted and maintained by Denx Software Engineering. There is plenty of information available, and a good place to start is http://www.denx.de/wiki/.
Building U-Boot
Begin by getting the source code. As with most projects, the recommended way is to clone the Git repository and check out the tag you intend to use, which, in this case, is the version that was current at the time of writing:
$ git clone git://git.denx.de/u-boot.git
$ cd u-boot
$ git checkout v2017.01
Alternatively, you can get a tarball from ftp://ftp.denx.de/pub/u-boot.
There are more than 1,000 configuration files for common development boards and devices in the configs/ directory. In most cases, you can make a good guess of which to use, based on the filename, but you can get more detailed information by looking through the per-board README files in the board/ directory, or you can find information in an appropriate web tutorial or forum.
Taking the BeagleBone Black as an example, we find that there is a likely configuration file named configs/am335x_boneblack_defconfig, and we find the text "The binary produced by this board supports … Beaglebone Black" in the board README file for the am335x chip, board/ti/am335x/README. With this knowledge, building U-Boot for a BeagleBone Black is simple. You need to inform U-Boot of the prefix for your cross compiler by setting the make variable CROSS_COMPILE, and then select the configuration file using a command of the type make [board]_defconfig. Therefore, to build U-Boot using the Crosstool-NG compiler we created in Chapter 2, Learning About Toolchains, you would type:
$ source MELP/chapter_02/set-path-arm-cortex_a8-linux-gnueabihf
$ make CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- am335x_boneblack_defconfig
$ make CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf-
The results of the compilation are:
u-boot: U-Boot in ELF object format, suitable for use with a debugger
u-boot.map: The symbol table
u-boot.bin: U-Boot in raw binary format, suitable for running on your device
u-boot.img: This is u-boot.bin with a U-Boot header added, suitable for uploading to a running copy of U-Boot
u-boot.srec: U-Boot in Motorola S-record (SRECORD or SRE) format, suitable for transferring over a serial connection
The BeagleBone Black also requires a secondary program loader (SPL), as described earlier. This is built at the same time and is named MLO:
$ ls -l MLO u-boot*
-rw-rw-r-- 1 chris chris   78416 Mar  9 10:13 u-boot/MLO
-rwxrwxr-x 1 chris chris 2943940 Mar  9 10:13 u-boot/u-boot
-rwxrwxr-x 1 chris chris  368348 Mar  9 10:13 u-boot/u-boot.bin
-rw-rw-r-- 1 chris chris  368412 Mar  9 10:13 u-boot/u-boot.img
-rw-rw-r-- 1 chris chris  520741 Mar  9 10:13 u-boot/u-boot.map
-rwxrwxr-x 1 chris chris 1105162 Mar  9 10:13 u-boot/u-boot.srec
The procedure is similar for other targets.
Installing U-Boot
Installing a bootloader on a board for the first time requires some outside assistance. If the board has a hardware debug interface, such as JTAG, it is usually possible to load a copy of U-Boot directly into RAM and set it running. From that point, you can use U-Boot commands to copy itself into flash memory. The details of this are very board-specific and outside the scope of this book.
Many SoC designs have a boot ROM built in, which can be used to read boot code from various external sources, such as SD cards, serial interfaces, or USB mass storage. This is the case with the am335x chip in the BeagleBone Black, which makes it easy to try out new software.
You will need an SD card reader to write the images to a card. There are two types: external readers that plug into a USB port, and the internal SD readers that are present on many laptops. A device name is assigned by Linux when a card is plugged into the reader. The command lsblk is a useful tool to find out which device has been allocated. For example, this is what I see when I plug a nominal 8 GB microSD card into my card reader:
$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0   477G  0 disk
├─sda1   8:1    0   500M  0 part /boot/efi
├─sda2   8:2    0    40M  0 part
├─sda3   8:3    0     3G  0 part
├─sda4   8:4    0 457.6G  0 part /
└─sda5   8:5    0  15.8G  0 part [SWAP]
sdb      8:16   1   7.2G  0 disk
└─sdb1   8:17   1   7.2G  0 part /media/chris/101F-5626
In this case, sda is my 512 GB hard drive and sdb is the microSD card. It has a single partition, sdb1, which is mounted as directory /media/chris/101F-5626.
Although the microSD card had 8 GB printed on the outside, it was only 7.2 GB on the inside. In part, this is because of the different units used. The advertised capacity is measured in Gigabytes, 10^9 bytes, but the sizes reported by software are in Gibibytes, 2^30 bytes. Gigabytes are abbreviated GB, Gibibytes as GiB. The same applies for KB and KiB, and MB and MiB. In this book, I have tried to use the right units. In the case of the SD card, it so happens that 8 Gigabytes is approximately 7.4 Gibibytes. The remaining discrepancy is because flash memory always has to reserve some space for bad block handling. This is a topic that I will return to in Chapter 7, Creating a Storage Strategy.
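The unit conversion is simple arithmetic, shown here as a short Python calculation. The variable names are only for illustration.

```python
GB = 10**9     # Gigabyte, as printed on the card
GiB = 2**30    # Gibibyte, as reported by lsblk

advertised_bytes = 8 * GB
capacity_gib = advertised_bytes / GiB
print(f"{capacity_gib:.2f} GiB")   # 7.45 GiB
```

The 7.2 GiB reported by lsblk is a little less than this figure; the difference is the space held back by the card.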
If I use the built-in SD card slot, I see this:
$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda           8:0    0   477G  0 disk
├─sda1        8:1    0   500M  0 part /boot/efi
├─sda2        8:2    0    40M  0 part
├─sda3        8:3    0     3G  0 part
├─sda4        8:4    0 457.6G  0 part /
└─sda5        8:5    0  15.8G  0 part [SWAP]
mmcblk0     179:0    0   7.2G  0 disk
└─mmcblk0p1 179:1    0   7.2G  0 part /media/chris/101F-5626
In this case, the microSD card appears as mmcblk0 and the partition is mmcblk0p1. Note that the microSD card you use may have been formatted differently to this one and so you may see a different number of partitions with different mount points. When formatting an SD card, it is very important to be sure of its device name. You really don't want to mistake your hard drive for an SD card and format that instead. This has happened to me more than once. So, I have provided a shell script in the book's code archive named MELP/format-sdcard.sh, which has a reasonable number of checks to prevent you (and me) from using the wrong device name. The parameter is the device name of the microSD card, which would be sdb in the first example and mmcblk0 in the second. Here is an example of its use:
$ MELP/format-sdcard.sh mmcblk0
The script creates two partitions: the first is 64 MiB, formatted as FAT32, and will contain the bootloader, and the second is 1 GiB, formatted as ext4, which you will use in Chapter 5, Building a Root Filesystem.
After you have formatted the microSD card, remove it from the card reader and then re-insert it so that the partitions are auto mounted. On current versions of Ubuntu, the two partitions should be mounted as /media/[user]/boot and /media/[user]/rootfs. Now you can copy the SPL and U-Boot to it like this:
$ cp MLO u-boot.img /media/chris/boot
Finally, unmount it:
$ sudo umount /media/chris/boot
Now, with no power on the BeagleBone board, insert the microSD card into the card slot on the board. Plug in the serial cable. A serial port should appear on your PC as /dev/ttyUSB0. Start a suitable terminal program, such as gtkterm, minicom, or picocom, and attach to the port at 115200 bps (bits per second) with no flow control. gtkterm is probably the easiest to set up and use:
$ gtkterm -p /dev/ttyUSB0 -s 115200
Press and hold the Boot Switch button on the BeagleBone Black, power up the board using the external 5V power connector, and release the button after about 5 seconds. You should see a U-Boot prompt on the serial console:
U-Boot #
Using U-Boot
In this section, I will describe some of the common tasks that you can use U-Boot to perform.
Usually, U-Boot offers a command-line interface over a serial port. It gives a command prompt which is customized for each board. In the examples, I will use U-Boot #. Typing help prints out all the commands configured in this version of U-Boot; typing help <command> prints out more information about a particular command.
The default command interpreter for the BeagleBone Black is quite simple. You cannot do command-line editing by pressing cursor left or right keys; there is no command completion by pressing the Tab key; and there is no command history by pressing the cursor up key. Pressing any of these keys will disrupt the command you are currently trying to type, and you will have to type Ctrl + C and start over again. The only line editing key you can safely use is the backspace. As an option, you can configure a different command shell called Hush, which has more sophisticated interactive support, including command-line editing.
The default number format is hexadecimal. Consider the following command as an example:
nand read 82000000 400000 200000
This will read 0x200000 bytes from offset 0x400000 from the start of the NAND flash memory into RAM address 0x82000000.
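Because the numbers carry no 0x prefix, it is easy to misread them as decimal. The Python sketch below interprets the arguments the way the text describes, as hexadecimal; parse_nand_read is a hypothetical helper written for this illustration and is not part of U-Boot.

```python
def parse_nand_read(command):
    """Return (ram_address, flash_offset, size) from a 'nand read' command."""
    words = command.split()
    if words[:2] != ["nand", "read"] or len(words) != 5:
        raise ValueError("expected: nand read <addr> <offset> <size>")
    # Every numeric argument is hexadecimal, with or without a 0x prefix
    address, offset, size = (int(word, 16) for word in words[2:])
    return address, offset, size


address, offset, size = parse_nand_read("nand read 82000000 400000 200000")
print(hex(address), hex(offset), size // (1024 * 1024), "MiB")
```

Read this way, the size argument 200000 is 0x200000 bytes, which is 2 MiB and not two hundred thousand bytes.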
Environment variables
U-Boot uses environment variables extensively to store and pass information between functions and even to create scripts. Environment variables are simple name=value pairs that are stored in an area of memory. The initial population of variables may be coded in the board configuration header file, with each pair terminated by a null character and each line except the last continued with a backslash, like this:
#define CONFIG_EXTRA_ENV_SETTINGS \
    "myvar1=value1\0" \
    "myvar2=value2\0"
[...]
You can create and modify variables from the U-Boot command line using setenv. For example, setenv foo bar creates the variable foo with the value bar. Note that there is no = sign between the variable name and the value. You can delete a variable by setting it to a null string, setenv foo. You can print all the variables to the console using printenv, or a single variable using printenv foo.
If U-Boot has been configured with space to store the environment, you can use the saveenv command to save it. If there is raw NAND or NOR flash, then an erase block can be reserved for this purpose, often with another used for a redundant copy to guard against corruption. If there is eMMC or SD card storage, it can be stored in a reserved array of sectors, or in a file named uboot.env in a partition of the disk. Other options include storing in a serial EEPROM connected via an I2C or SPI interface or non-volatile RAM.
Boot image format
U-Boot doesn't have a filesystem. Instead, it tags blocks of information with a 64-byte header so that it can track the contents. You prepare files for U-Boot using the mkimage command. Here is a brief summary of its usage:
$ mkimage
Usage: mkimage -l image
          -l ==> list image header information
       mkimage [-x] -A arch -O os -T type -C comp -a addr -e ep -n name -d data_file[:data_file...] image
          -A ==> set architecture to 'arch'
          -O ==> set operating system to 'os'
          -T ==> set image type to 'type'
          -C ==> set compression type 'comp'
          -a ==> set load address to 'addr' (hex)
          -e ==> set entry point to 'ep' (hex)
          -n ==> set image name to 'name'
          -d ==> use image data from 'datafile'
          -x ==> set XIP (execute in place)
       mkimage [-D dtc_options] [-f fit-image.its|-F] fit-image
          -D => set options for device tree compiler
          -f => input filename for FIT source
Signing / verified boot not supported (CONFIG_FIT_SIGNATURE undefined)
       mkimage -V ==> print version information and exit
For example, to prepare a kernel image for an ARM processor, the command is:
$ mkimage -A arm -O linux -T kernel -C gzip -a 0x80008000 -e 0x80008000 \
-n 'Linux' -d zImage uImage
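To show what the 64-byte header holds, here is a Python sketch that unpacks it. It assumes the legacy image layout from U-Boot's include/image.h: seven big-endian 32-bit fields (magic, header CRC, timestamp, data size, load address, entry point, data CRC), four single-byte fields (OS, architecture, type, compression), and a 32-byte name. The function name and the returned dictionary are hypothetical; treat this as an illustration rather than a replacement for mkimage -l.

```python
import struct
import zlib

HEADER_FORMAT = ">7I4B32s"                     # big-endian, 64 bytes in total
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)
LEGACY_MAGIC = 0x27051956                      # magic number of a legacy image


def parse_legacy_header(blob):
    """Decode the header at the start of 'blob' (bytes read from a uImage)."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("file is shorter than the header")
    (magic, hcrc, timestamp, size, load, entry, dcrc,
     os_id, arch, image_type, comp, name) = struct.unpack(
        HEADER_FORMAT, blob[:HEADER_SIZE])
    if magic != LEGACY_MAGIC:
        raise ValueError("not a legacy U-Boot image")
    # The header CRC is calculated with its own field (bytes 4 to 7) zeroed
    zeroed = blob[:4] + b"\x00" * 4 + blob[8:HEADER_SIZE]
    return {
        "name": name.rstrip(b"\x00").decode("ascii", "replace"),
        "data_size": size,
        "load_address": load,
        "entry_point": entry,
        "header_crc_ok": (zlib.crc32(zeroed) & 0xFFFFFFFF) == hcrc,
    }
```

A typical use would be parse_legacy_header(open("uImage", "rb").read(64)), where uImage is the file produced by the mkimage command above.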
Loading images
Usually, you will load images from removable storage, such as an SD card or a network. SD cards are handled in U-Boot by the mmc driver. A typical sequence to load an image into memory would be:
U-Boot # mmc rescan
U-Boot # fatload mmc 0:1 82000000 uimage
reading uimage
4605000 bytes read in 254 ms (17.3 MiB/s)
U-Boot # iminfo 82000000
## Checking Image at 82000000 ...
Legacy image found
Image Name: Linux-3.18.0
Created: 2014-12-23 21:08:07 UTC
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 4604936 Bytes = 4.4 MiB
Load Address: 80008000
Entry Point: 80008000
Verifying Checksum ... OK
The command mmc rescan re-initializes the mmc driver, perhaps to detect that an SD card has recently been inserted. Next, fatload is used to read a file from a FAT-formatted partition on the SD card. The format is:
fatload <interface> [<dev[:part]> [<addr> [<filename> [bytes [pos]]]]]
If <interface> is mmc, as in our case, <dev:part> is the device number of the mmc interface counting from zero, and the partition number counting from one. Hence, <0:1> is the first partition on the first device. The memory location, 0x82000000, is chosen to be in an area of RAM that is not being used at this moment. If we intend to boot this kernel, we have to make sure that this area of RAM will not be overwritten when the kernel image is decompressed and located at the runtime location, 0x80008000.
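The requirement that the two regions do not collide can be checked with a small range comparison. In this Python sketch the load address, image size, and runtime address come from the example above, while the decompressed kernel size of 16 MiB is an assumption made purely for illustration; the real figure depends on your kernel build.

```python
def ranges_overlap(start_a, size_a, start_b, size_b):
    """True if [start_a, start_a + size_a) intersects [start_b, start_b + size_b)."""
    return start_a < start_b + size_b and start_b < start_a + size_a


load_address = 0x82000000        # where fatload placed the uImage
image_size = 4605000             # bytes read, from the fatload output
runtime_address = 0x80008000     # where the kernel runs
assumed_kernel_size = 16 * 1024 * 1024   # assumption, not a measured value

print(ranges_overlap(load_address, image_size,
                     runtime_address, assumed_kernel_size))
```

With these numbers the runtime region ends at 0x81008000, well below 0x82000000, so the call returns False and the load address is safe.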
To load image files over a network, you use the Trivial File Transfer Protocol (TFTP). This requires you to install a TFTP daemon, tftpd, on your development system and start it running. You also have to configure any firewalls between your PC and the target board to allow the TFTP protocol on UDP port 69 to pass through. The default configuration of TFTP allows access only to the directory /var/lib/tftpboot. The next step is to copy the files you want to transfer to the target into that directory. Then, assuming that you are using a pair of static IP addresses, which removes the need for further network administration, the sequence of commands to load a set of kernel image files should look like this:
U-Boot # setenv ipaddr 192.168.159.42
U-Boot # setenv serverip 192.168.159.99
U-Boot # tftp 82000000 uImage
link up on port 0, speed 100, full duplex
Using cpsw device
TFTP from server 192.168.159.99; our IP address is 192.168.159.42
Filename 'uImage'.
Load address: 0x82000000
Loading:
#################################################################
#################################################################
#################################################################
######################################################
######################################################
3 MiB/s
done
Bytes transferred = 4605000 (464448 hex)
Finally, let's look at how to program images into NAND flash memory and read them back, which is handled by the nand command. This example loads a kernel image via TFTP and programs it into flash:
U-Boot # tftpboot 82000000 uimage
U-Boot # nandecc hw
U-Boot # nand erase 280000 400000
NAND erase: device 0 offset 0x280000, size 0x400000
Erasing at 0x660000 -- 100% complete.
OK
U-Boot # nand write 82000000 280000 400000
NAND write: device 0 offset 0x280000, size 0x400000
4194304 bytes written: OK
Now you can load the kernel from flash memory using the nand read command:
U-Boot # nand read 82000000 280000 400000
Booting Linux
The bootm command starts a kernel image running. The syntax is:
bootm [address of kernel] [address of ramdisk] [address of dtb]
The address of the kernel image is necessary, but the addresses of the ramdisk and dtb can be omitted if the kernel configuration does not need them. If there is a dtb but no initramfs, the second address can be replaced with a dash (-). That would look like this:
U-Boot # bootm 82000000 - 83000000
Automating the boot with U-Boot scripts
Plainly, typing a long series of commands to boot your board each time it is turned on is not acceptable. To automate the process, U-Boot stores a sequence of commands in environment variables. If the special variable named bootcmd contains a script, it is run at power-up after a delay of bootdelay seconds. If you watch this on the serial console, you will see the delay counting down to zero. You can press any key during this period to terminate the countdown and enter into an interactive session with U-Boot.
The way that you create scripts is simple, though not easy to read. You simply append commands separated by semicolons, which must be preceded by a backslash escape character. So, for example, to load a kernel image from an offset in flash memory and boot it, you might use the following command:
setenv bootcmd nand read 82000000 400000 200000\;bootm 82000000
Porting U-Boot to a new board
Let's assume that your hardware department has created a new board called Nova that is based on the BeagleBone Black and that you need to port U-Boot to it. You will need to understand the layout of the U-Boot code and how the board configuration mechanism works. In this section, I will show you how to create a variant of an existing board, the BeagleBone Black, which you could go on to use as the basis for further customizations. There are quite a few files that need to be changed. I have put them together into a patch file in the code archive in MELP/chapter_03/0001-BSP-for-Nova.patch. You can simply apply that patch to a clean copy of U-Boot version 2017.01 like this:
$ cd u-boot
$ patch -p1 < MELP/chapter_03/0001-BSP-for-Nova.patch
If you want to use a different version of U-Boot, you will have to make some changes to the patch for it to apply cleanly.
The remainder of this section is a description of how the patch was created. If you want to follow along step-by-step, you will need a clean copy of U-Boot 2017.01 without the Nova BSP patch. The main directories we will be dealing with are:
arch: Contains code specific to each supported architecture in directories arm, mips, powerpc, and so on. Within each architecture, there is a subdirectory for each member of the family; for example, in arch/arm/cpu/, there are directories for the architecture variants, including arm926ejs, armv7, and armv8.
board: Contains code specific to a board. Where there are several boards from the same vendor, they can be collected together into a subdirectory. Hence, the support for the am335x evm board, on which the BeagleBone is based, is in board/ti/am335x.
common: Contains core functions including the command shells and the commands that can be called from them, each in a file named cmd_[command name].c.
doc: Contains several README files describing various aspects of U-Boot. If you are wondering how to proceed with your U-Boot port, this is a good place to start.
include: In addition to many shared header files, this contains the very important subdirectory include/configs/ where you will find the majority of the board configuration settings.
The way that Kconfig extracts configuration information from Kconfig files and stores the total system configuration in a file named .config is described in some detail in Chapter 4, Configuring and Building the Kernel. Each board has a default configuration stored in configs/[board name]_defconfig. For the Nova board, we can begin by making a copy of the configuration for the BeagleBone Black:
$ cp configs/am335x_boneblack_defconfig configs/nova_defconfig
Now edit configs/nova_defconfig and change line four from CONFIG_TARGET_AM335X_EVM=y to CONFIG_TARGET_NOVA=y:
1 CONFIG_ARM=y
2 CONFIG_AM33XX=y
3 # CONFIG_SPL_NAND_SUPPORT is not set
4 CONFIG_TARGET_NOVA=y
[...]
Note that CONFIG_ARM=y causes the contents of arch/arm/Kconfig to be included, and on line two, CONFIG_AM33XX=y causes arch/arm/mach-omap2/am33xx/Kconfig to be included.
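If you prefer to script the copy-and-edit step rather than use an editor, something along these lines should work. This is only a sketch: derive_defconfig is a hypothetical helper, and the demonstration runs on a four-line stand-in file in a temporary directory rather than on a real U-Boot tree.

```shell
#!/bin/sh
# Hypothetical helper: copy a defconfig while swapping the target symbol.
# Arguments: source file, destination file, old target, new target.
derive_defconfig() {
    sed "s/^CONFIG_TARGET_$3=y\$/CONFIG_TARGET_$4=y/" "$1" > "$2"
}

# Stand-in for configs/am335x_boneblack_defconfig (first four lines only).
workdir=$(mktemp -d)
printf '%s\n' \
    'CONFIG_ARM=y' \
    'CONFIG_AM33XX=y' \
    '# CONFIG_SPL_NAND_SUPPORT is not set' \
    'CONFIG_TARGET_AM335X_EVM=y' > "$workdir/am335x_boneblack_defconfig"

derive_defconfig "$workdir/am335x_boneblack_defconfig" \
    "$workdir/nova_defconfig" AM335X_EVM NOVA

# Show the result; line four should now name the Nova target.
cat "$workdir/nova_defconfig"
```

In a real tree you would pass configs/am335x_boneblack_defconfig and configs/nova_defconfig as the first two arguments.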
Board-specific files
Each board has a subdirectory named board/[board name] or board/[vendor]/[board name], which should contain:
Kconfig: Contains configuration options for the board
MAINTAINERS: Contains a record of whether the board is currently maintained and, if so, by whom
Makefile: Used to build the board-specific code
README: Contains any useful information about this port of U-Boot; for example, which hardware variants are covered
In addition, there may be source files for board-specific functions.
Our Nova board is based on a BeagleBone which, in turn, is based on a TI am335x EVM, so we should take a copy of the am335x board files:
$ mkdir board/ti/nova
$ cp -a board/ti/am335x/* board/ti/nova
Next, edit board/ti/nova/Kconfig and set SYS_BOARD to "nova", so that it will build the files in board/ti/nova, and set SYS_CONFIG_NAME to "nova" as well, so that the configuration file used will be include/configs/nova.h:
if TARGET_NOVA

config SPL_ENV_SUPPORT
	default y

config SPL_WATCHDOG_SUPPORT
	default y

config SPL_YMODEM_SUPPORT
	default y

config SYS_BOARD
	default "nova"

config SYS_VENDOR
	default "ti"

config SYS_SOC
	default "am33xx"

config SYS_CONFIG_NAME
	default "nova"
[...]
There is one other file here that we need to change. The linker script placed at board/ti/nova/u-boot.lds has a hard-coded reference to board/ti/am335x/built-in.o on line 39. Change it as shown:
35 {
36 *(.__image_copy_start)
37 *(.vectors)
38 CPUDIR/start.o (.text*)
39 board/ti/nova/built-in.o (.text*)
40 *(.text*)
41 }
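Hard-coded paths like this one are easy to miss after copying a board directory. One way to look for them, sketched here under the assumption that a plain text search is enough, is to list every copied file that still mentions the old board path. The function name find_stale_refs is made up for this example, and the demonstration uses two stand-in files rather than the real board sources.

```shell
#!/bin/sh
# Hypothetical helper: list files under directory $1 that still contain
# the fixed string $2 (for example, the path of the board we copied from).
find_stale_refs() {
    grep -rlF -- "$2" "$1"
}

# Stand-in for a freshly copied board directory (not a real U-Boot tree).
board_dir=$(mktemp -d)
printf '%s\n' 'if TARGET_NOVA' > "$board_dir/Kconfig"
printf '%s\n' 'board/ti/am335x/built-in.o (.text*)' > "$board_dir/u-boot.lds"

# Only u-boot.lds should be reported.
find_stale_refs "$board_dir" "board/ti/am335x"
```

In a real tree you would run find_stale_refs board/ti/nova board/ti/am335x and review each file it reports.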
Now we need to link the Kconfig file for Nova into the chain of Kconfig files. First, edit arch/arm/Kconfig and add a menu option for Nova, and then source its Kconfig file:
[...]
source "board/ti/nova/Kconfig"
[...]
Then, edit arch/arm/mach-omap2/am33xx/Kconfig and add a configuration option for TARGET_NOVA:
[...]
config TARGET_NOVA
	bool "Support the Nova! board"
	select DM
	select DM_SERIAL
	select DM_GPIO
	select TI_I2C_BOARD_DETECT
	help
	  The Nova target board
[...]
Configuring header files
Each board has a header file in include/configs/ which contains the majority of the configuration information. The file is named by the SYS_CONFIG_NAME identifier in the board's Kconfig. The format of this file is described in detail in the README file at the top level of the U-Boot source tree. For the purposes of our Nova board, simply copy include/configs/am335x_evm.h to include/configs/nova.h and make a small number of changes, the most significant of which is to set a new command prompt so that we can identify this bootloader at run-time:
[...]
#ifndef __CONFIG_NOVA_H
#define __CONFIG_NOVA_H
[...]
#define CONFIG_SYS_LDSCRIPT "board/ti/nova/u-boot.lds"
[...]
#undef CONFIG_SYS_PROMPT
#define CONFIG_SYS_PROMPT "nova!> "
[...]
#endif /* ! __CONFIG_NOVA_H */
Building and testing
To build for the Nova board, select the configuration you have just created:
$ make CROSS_COMPILE=arm-cortex_a8-linux-gnueabi- distclean
$ make CROSS_COMPILE=arm-cortex_a8-linux-gnueabi- nova_defconfig
$ make CROSS_COMPILE=arm-cortex_a8-linux-gnueabi-
Copy MLO and u-boot.img to the boot partition of the microSD card you created earlier and boot the board. You should see output like this (note the command prompt):
U-Boot SPL 2017.01-dirty (Apr 20 2017 - 16:48:38)
Trying to boot from MMC1MMC partition switch failed
*** Warning - MMC partition switch failed, using default environment
reading u-boot.img
reading u-boot.img
U-Boot 2017.01-dirty (Apr 20 2017 - 16:48:38 +0100)
CPU : AM335X-GP rev 2.0
I2C: ready
DRAM: 512 MiB
MMC: OMAP SD/MMC: 0, OMAP SD/MMC: 1
*** Warning - bad CRC, using default environment
<ethaddr> not set. Validating first E-fuse MAC
Net: cpsw, usb_ether
Press SPACE to abort autoboot in 2 seconds
nova!>
You can create a patch for all of these changes by checking them into Git and using the git format-patch command:
$ git add .
$ git commit -m "BSP for Nova"
[nova-bsp-2 e160f82] BSP for Nova
12 files changed, 2272 insertions(+)
create mode 100644 board/ti/nova/Kconfig
create mode 100644 board/ti/nova/MAINTAINERS
create mode 100644 board/ti/nova/Makefile
create mode 100644 board/ti/nova/README
create mode 100644 board/ti/nova/board.c
create mode 100644 board/ti/nova/board.h
create mode 100644 board/ti/nova/mux.c
create mode 100644 board/ti/nova/u-boot.lds
create mode 100644 configs/nova_defconfig
create mode 100644 include/configs/nova.h
$ git format-patch -1
0001-BSP-for-Nova.patch
Falcon mode
We are used to the idea that booting a modern embedded processor involves the CPU boot ROM loading an SPL, which loads u-boot.bin, which then loads a Linux kernel. You may be wondering if there is a way to reduce the number of steps, thereby simplifying and speeding up the boot process. The answer is U-Boot Falcon mode. The idea is simple: have the SPL load a kernel image directly, missing out u-boot.bin. There is no user interaction and there are no scripts. It just loads a kernel from a known location in flash or eMMC into memory, passes it a pre-prepared parameter block, and starts it running. The details of configuring Falcon mode are beyond the scope of this book. If you would like more information, take a look at doc/README.falcon.
Falcon mode is named after the Peregrine falcon, which is the fastest bird of all, capable of reaching speeds of more than 200 miles per hour in a dive.
Barebox
I will complete this chapter with a look at another bootloader that has the same roots as U-Boot but takes a new approach to bootloaders. It is derived from U-Boot and was actually called U-Boot v2 in the early days. The barebox developers aimed to combine the best parts of U-Boot and Linux, including a POSIX-like API and mountable filesystems.
The barebox project website is http://barebox.org/.
Getting barebox
To get barebox, clone the Git repository and check out the version you want to use:
$ git clone git://git.pengutronix.de/git/barebox.git
$ cd barebox
$ git checkout v2017.02.0
The layout of the code is similar to U-Boot:
arch: Contains code specific to each supported architecture, which includes all the major embedded architectures. SoC support is in arch/[architecture]/mach-[SoC]. Support for individual boards is in arch/[architecture]/boards.
common: Contains core functions, including the shell.
commands: Contains the commands that can be called from the shell.
Documentation: Contains the templates for documentation files. To build it, type make docs. The results are put in Documentation/html.
drivers: Contains the code for the device drivers.
include: Contains header files.
Building barebox
Barebox has used Kconfig/Kbuild for a long time. There are default configuration files in arch/[architecture]/configs. As an example, assume that you want to build barebox for the BeagleBone Black. You need two configurations, one for the SPL and one for the main binary. First, build MLO:
$ make ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- \
am335x_mlo_defconfig
$ make ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf-
The result is the secondary program loader, images/barebox-am33xx-beaglebone-mlo.img.
Next, build barebox:
$ make ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- \
am335x_defconfig
$ make ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf-
Copy MLO and the barebox binary to an SD card:
$ cp images/barebox-am33xx-beaglebone-mlo.img /media/chris/boot/MLO
$ cp images/barebox-am33xx-beaglebone.img /media/chris/boot/barebox.bin
Then, boot up the board and you should see messages like these on the console:
barebox 2017.02.0 #1 Thu Mar 9 20:27:08 GMT 2017
Board: TI AM335x BeagleBone black
detected 'BeagleBone Black'
[...]
running /env/bin/init...
changing USB current limit to 1300 mA... done
Hit m for menu or any other key to stop autoboot: 3
type exit to get to the menu
barebox@TI AM335x BeagleBone black:/
Using barebox
Using barebox at the command line, you can see the similarities with Linux. First, you can see that there are filesystem commands such as ls, and there is a /dev directory:
# ls /dev
full mdio0-phy00 mem mmc0 mmc0.0
mmc0.1 mmc1 mmc1.0 null ram0 zero
The device /dev/mmc0.0 is the first partition on the microSD card, which contains the kernel and initial ramdisk. You can mount it like this:
# mount /dev/mmc0.0 /mnt
Now you can see the files:
# ls /mnt
MLO am335x-boneblack.dtb barebox.bin
u-boot.img uRamdisk zImage
Boot from the root partition:
# global.bootm.oftree=/mnt/am335x-boneblack.dtb
# global linux.bootargs.root="root=/dev/mmcblk0p2 rootwait"
# bootm /mnt/zImage
Summary
Every system needs a bootloader to bring the hardware to life and to load a kernel. U-Boot has found favor with many developers because it supports a useful range of hardware and it is fairly easy to port to a new device. Over the last few years, the complexity and ever increasing variety of embedded hardware has led to the introduction of the device tree as a way of describing hardware. The device tree is simply a textual representation of a system that is compiled into a device tree binary (dtb) and which is passed to the kernel when it loads. It is up to the kernel to interpret the device tree and to load and initialize drivers for the devices it finds there.
In use, U-Boot is very flexible, allowing images to be loaded from mass storage, flash memory, or a network, and booted. Likewise, barebox can achieve the same but with a smaller base of hardware support. Despite its cleaner design and POSIX-inspired internal APIs, at the time of writing it does not seem to have been accepted beyond its own small but dedicated community.
Having covered some of the intricacies of booting Linux, in the next chapter you will see the next stage of the process as the third element of your embedded project, the kernel, comes into play.
Configuring and Building the Kernel
The kernel is the third element of embedded Linux. It is the component that is responsible for managing resources and interfacing with hardware, and so affects almost every aspect of your final software build. It is usually tailored to your particular hardware configuration, although, as we saw in Chapter 3, All About Bootloaders, device trees allow you to create a generic kernel that is tailored to particular hardware by the contents of the device tree.
In this chapter, we will look at how to get a kernel for a board, and how to configure and compile it. We will look again at bootstrap, this time focusing on the part the kernel plays. We will also look at device drivers and how they pick up information from the device tree.
In this chapter, we will cover the following topics:
What does the kernel do?
Choosing a kernel.
Building the kernel.
Booting the kernel.
Porting Linux to a new board.
What does the kernel do?
Linux began in 1991, when Linus Torvalds started writing an operating system for Intel 386- and 486-based personal computers. He was inspired by the Minix operating system written by Andrew S. Tanenbaum four years earlier. Linux differed in many ways from Minix; the main differences being that it was a 32-bit virtual memory kernel and the code was open source, later released under the GPL v2 license. He announced it on 25th August, 1991, on the comp.os.minix newsgroup in a famous post that began with:
Hello everybody out there using minix - I'm doing a (free) operating system (just a hobby, won't be big and professional like GNU) for 386(486) AT clones. This has been brewing since April, and is starting to get ready. I'd like any feedback on things people like/dislike in minix, as my OS resembles it somewhat (same physical layout of the filesystem (due to practical reasons) among other things).
To be strictly accurate, Linus did not write an operating system, rather he wrote a kernel, which is only one component of an operating system. To create a complete operating system with user space commands and a shell command interpreter, he used components from the GNU project, especially the toolchain, the C-library, and basic command-line tools. That distinction remains today, and gives Linux a lot of flexibility in the way it is used. It can be combined with a GNU user space to create a full Linux distribution that runs on desktops and servers, which is sometimes called GNU/Linux; it can be combined with an Android user space to create the well-known mobile operating system, or it can be combined with a small BusyBox-based user space to create a compact embedded system. Contrast this with the BSD operating systems, FreeBSD, OpenBSD, and NetBSD, in which the kernel, the toolchain, and the user space are combined into a single code base.
The kernel has three main jobs: to manage resources, to interface with hardware, and to provide an API that offers a useful level of abstraction to user space programs, as summarized in the following diagram:
Applications running in User space run at a low CPU privilege level. They can do very little other than make library calls. The primary interface between the User space and the Kernel space is the C library, which translates user level functions, such as those defined by POSIX, into kernel system calls. The system call interface uses an architecture-specific method, such as a trap or a software interrupt, to switch the CPU from low privilege user mode to high privilege kernel mode, which allows access to all memory addresses and CPU registers.
The System call handler dispatches the call to the appropriate kernel subsystem: memory allocation calls go to the memory manager, filesystem calls to the filesystem code, and so on. Some of those calls require input from the underlying hardware and will be passed down to a device driver. In some cases, the hardware itself invokes a kernel function by raising an interrupt.
The preceding diagram shows that there is a second entry point into kernel code: hardware interrupts. Interrupts can only be handled in a device driver, never by a user space application.
In other words, all the useful things that your application does, it does through the kernel. The kernel, then, is one of the most important elements in the system.
Choosing a kernel
The next step is to choose the kernel for your project, balancing the desire to always use the latest version of software against the need for vendor-specific additions and an interest in the long term support of the code base.
Kernel development cycle
Linux is developed at a fast pace, with a new version being released every 8 to 12 weeks. The way that the version numbers are constructed has changed a bit in recent years. Before July 2011, there was a three number version scheme with version numbers that looked like 2.6.39. The middle number indicated whether it was a developer or stable release; odd numbers (2.1.x, 2.3.x, 2.5.x) were for developers and even numbers were for end users. From version 2.6 onwards, the idea of a long-lived development branch (the odd numbers) was dropped, as it slowed down the rate at which new features were made available to the users. The change in numbering from 2.6.39 to 3.0 in July 2011 was purely because Linus felt that the numbers were becoming too large; there was no huge leap in the features or architecture of Linux between those two versions. He also took the opportunity to drop the middle number. Since then, in April 2015, he bumped the major from 3 to 4, again purely for neatness, not because of any large architectural shift.
Linus manages the development kernel tree. You can follow him by cloning the Git tree like so:
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
This will check out into subdirectory linux. You can keep up to date by running the command git pull in that directory from time to time.
Currently, a full cycle of kernel development begins with a merge window of two weeks, during which Linus will accept patches for new features. At the end of the merge window, a stabilization phase begins, during which Linus will produce weekly release candidates with version numbers ending in -rc1, -rc2, and so on, usually up to -rc7 or -rc8. During this time, people test the candidates and submit bug reports and fixes. When all significant bugs have been fixed, the kernel is released.
The code incorporated during the merge window has to be fairly mature already. Usually, it is pulled from the repositories of the many subsystem and architecture maintainers of the kernel. By keeping to a short development cycle, features can be merged when they are ready. If a feature is deemed not sufficiently stable or well developed by the kernel maintainers, it can simply be delayed until the next release.
Keeping track of what has changed from release to release is not easy. You can read the commit log in Linus' Git repository but, with roughly 10,000 or more entries, it is not easy to get an overview. Thankfully, there is the Linux Kernel Newbies website, http://kernelnewbies.org, where you will find a succinct overview of each version at http://kernelnewbies.org/LinuxVersions.
Stable and long term support releases
The rapid rate of change of Linux is a good thing in that it brings new features into the mainline code base, but it does not fit very well with the longer life cycle of embedded projects. Kernel developers address this in two ways, with stable releases and long term releases. After the release of a mainline kernel (maintained by Linus Torvalds) it is moved to the stable tree (maintained by Greg Kroah-Hartman). Bug fixes are applied to the stable kernel, while the mainline kernel begins the next development cycle. Point releases of the stable kernel are marked by a third number, 3.18.1, 3.18.2, and so on. Before version 3, there were four release numbers, 2.6.29.1, 2.6.39.2, and so on.
You can get the stable tree by using the following command:
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
You can use git checkout to get a particular version, for example version 4.9.13:
$ cd linux-stable
$ git checkout v4.9.13
Usually, the stable kernel is updated only until the next mainline release (8 to 12 weeks later), so you will see that there is just one or sometimes two stable kernels at https://www.kernel.org/. To cater for those users who would like updates for a longer period of time and be assured that any bugs will be found and fixed, some kernels are labeled long term and maintained for two or more years. There is at least one long term kernel release each year. Looking at https://www.kernel.org/ at the time of writing, there are a total of nine long term kernels: 4.9, 4.4, 4.1, 3.18, 3.14, 3.12, 3.10, 3.4, and 3.2. The last of these has been maintained for five years and is at version 3.2.86. If you are building a product that you will have to maintain for this length of time, then the latest long term kernel might well be a good choice.
Vendor support
In an ideal world, you would be able to download a kernel from https://www.kernel.org/ and configure it for any device that claims to support Linux. However, that is not always possible; in fact mainline Linux has solid support for only a small subset of the many devices that can run Linux. You may find support for your board or SoC from independent open source projects, Linaro or the Yocto Project, for example, or from companies providing third party support for embedded Linux, but in many cases you will be obliged to look to the vendor of your SoC or board for a working kernel. As we know, some are better at supporting Linux than others. My only advice at this point is to choose vendors who give good support or who, even better, take the trouble to get their kernel changes into the mainline.
Licensing
The Linux source code is licensed under GPL v2, which means that you must make the source code of your kernel available in one of the ways specified in the license.
The actual text of the license for the kernel is in the file COPYING. It begins with an addendum written by Linus that states that code calling the kernel from user space via the system call interface is not considered a derivative work of the kernel and so is not covered by the license. Hence, there is no problem with proprietary applications running on top of Linux.
However, there is one area of Linux licensing that causes endless confusion and debate: kernel modules. A kernel module is simply a piece of code that is dynamically linked with the kernel at runtime, thereby extending the functionality of the kernel. The GPL makes no distinction between static and dynamic linking, so it would appear that the source for kernel modules is covered by the GPL. But, in the early days of Linux, there were debates about exceptions to this rule, for example, in connection with the Andrew filesystem. This code predates Linux and therefore (it was argued) is not a derivative work, and so the license does not apply. Similar discussions took place over the years with respect to other pieces of code, with the result that it is now accepted practice that the GPL does not necessarily apply to kernel modules. This is codified by the kernel MODULE_LICENSE macro, which may take the value Proprietary to indicate that it is not released under the GPL. If you plan to use the same arguments yourself, you may want to read through an oft-quoted e-mail thread titled "Linux GPL and binary module exception clause?" which is archived at http://yarchive.net/comp/linux/gpl_modules.html.
The GPL should be considered a good thing because it guarantees that when you and I are working on embedded projects, we can always get the source code for the kernel. Without it, embedded Linux would be much harder to use and more fragmented.
Building the kernel
Having decided which kernel to base your build on, the next step is to build it.
Getting the source
Both of the targets used in this book, the BeagleBone Black and the ARM Versatile PB, are well supported by the mainline kernel. Therefore, it makes sense to use the latest long-term kernel available from https://www.kernel.org/, which at the time of writing was 4.9.13. When you come to do this for yourself, you should check to see if there is a later version of the 4.9 kernel and use that instead since it will have fixes for bugs found after 4.9.13 was released. If there is a later long-term release, you may want to consider using that one, but be aware that there may have been changes that mean that the following sequence of commands does not work exactly as given.
Use this command to clone the stable kernel and check out version 4.9.13:
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
$ cd linux-stable
$ git checkout v4.9.13
Alternatively, you could download the tar file from https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.9.13.tar.xz.
There is a lot of code here. There are over 57,000 files in the 4.9 kernel containing C-source code, header files, and assembly code, amounting to a total of over 14 million lines of code, as measured by the SLOCCount utility. Nevertheless, it is worth knowing the basic layout of the code and to know, approximately, where to look for a particular component. The main directories of interest are:
arch: Contains architecture-specific files. There is one subdirectory per architecture.
Documentation: Contains kernel documentation. Always look here first if you want to find more information about an aspect of Linux.
drivers: Contains device drivers, thousands of them. There is a subdirectory for each type of driver.
fs: Contains filesystem code.
include: Contains kernel header files, including those required when building the toolchain.
init: Contains the kernel start-up code.
kernel: Contains core functions, including scheduling, locking, timers, power management, and debug/trace code.
mm: Contains memory management.
net: Contains network protocols.
scripts: Contains many useful scripts, including the device tree compiler, DTC, which I described in Chapter 3, All About Bootloaders.
tools: Contains many useful tools, including the Linux performance counters tool, perf, which I will describe in Chapter 15, Profiling and Tracing.
Over a period of time, you will become familiar with this structure, and realize that if you are looking for the code for the serial port of a particular SoC, you will find it in drivers/tty/serial and not in arch/$ARCH/mach-foo, because it is a device driver and not something central to the running of Linux on that SoC.
Understanding kernel configuration – Kconfig
One of the strengths of Linux is the degree to which you can configure the kernel to suit different jobs, from a small dedicated device such as a smart thermostat to a complex mobile handset. In current versions, there are many thousands of configuration options. Getting the configuration right is a task in itself but, before that, I want to show you how it works so that you can better understand what is going on.
The configuration mechanism is called Kconfig, and the build system that it integrates with is called Kbuild. Both are documented in Documentation/kbuild. Kconfig/Kbuild is used in a number of other projects as well as the kernel, including Crosstool-NG, U-Boot, Barebox, and BusyBox.
The configuration options are declared in a hierarchy of files named Kconfig, using a syntax described in Documentation/kbuild/kconfig-language.txt. In Linux, the top level Kconfig looks like this:
mainmenu "Linux/$ARCH $KERNELVERSION Kernel Configuration"
config SRCARCH
	string
	option env="SRCARCH"
source "arch/$SRCARCH/Kconfig"
The last line includes the architecture-dependent configuration file which sources other Kconfig files, depending on which options are enabled. Having the architecture play such a role has two implications: first, that you must specify an architecture when configuring Linux by setting ARCH=[architecture], otherwise it will default to the local machine architecture, and second, that the layout of the top level menu is different for each architecture.
The value you put into ARCH is one of the subdirectories you find in directory arch, with the oddity that ARCH=i386 and ARCH=x86_64 both source arch/x86/Kconfig.
The Kconfig files consist largely of menus, delineated by menu and endmenu keywords. Menu items are marked by the keyword config. Here is an example, taken from drivers/char/Kconfig:
menu "Character devices"
[...]
config DEVMEM
	bool "/dev/mem virtual device support"
	default y
	help
	  Say Y here if you want to support the /dev/mem device.
	  The /dev/mem device is used to access areas of physical
	  memory.
	  When in doubt, say "Y".
[...]
endmenu
The parameter following config names a variable that, in this case, is DEVMEM. Since this option is a bool (Boolean), it can only have two values: if it is enabled, it is assigned to y, if it is not enabled, the variable is not defined at all. The name of the menu item that is displayed on the screen is the string following the bool keyword.
This configuration item, along with all the others, is stored in a file named .config (note that the leading dot (.) means that it is a hidden file that will not be shown by the ls command, unless you type ls -a to show all the files). The line corresponding to this configuration item reads:
CONFIG_DEVMEM=y
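Because .config is a plain text file of NAME=value lines, it is easy to query from a script. The following sketch looks up one option; config_value is a hypothetical helper, printing n for an option that is not defined is this example's own convention, and the sample file is a two-line stand-in rather than a real kernel configuration.

```shell
#!/bin/sh
# Hypothetical helper: print the value of option $1 (without the CONFIG_
# prefix) from the .config-style file $2, or "n" if it is not defined.
config_value() {
    value=$(sed -n "s/^CONFIG_$1=//p" "$2")
    printf '%s\n' "${value:-n}"
}

# Stand-in for a real .config file.
config_file=$(mktemp)
printf '%s\n' \
    'CONFIG_DEVMEM=y' \
    '# CONFIG_TTY_PRINTK is not set' > "$config_file"

config_value DEVMEM "$config_file"      # enabled, so the value is y
config_value TTY_PRINTK "$config_file"  # only a comment line, so not defined
```

Note that a disabled option appears in the file only as a comment of the form "# CONFIG_... is not set", which is why the lookup anchors on the start of the line.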
There are several other data types in addition to bool. Here is the list:
bool: Either y or not defined.
tristate: Used where a feature can be built as a kernel module or built into the main kernel image. The values are m for a module, y to be built in, and not defined if the feature is not enabled.
int: An integer value using decimal notation.
hex: An unsigned integer value using hexadecimal notation.
string: A string value.
There may be dependencies between items, expressed by the depends on construct, as shown here:
config MTD_CMDLINE_PARTS
	tristate "Command line partition table parsing"
	depends on MTD
If CONFIG_MTD has not been enabled elsewhere, this menu option is not shown and so cannot be selected.
There are also reverse dependencies; the select keyword enables other options if this one is enabled. The Kconfig file in arch/$ARCH has a large number of select statements that enable features specific to the architecture, as can be seen here for ARM:
config ARM
	bool
	default y
	select ARCH_CLOCKSOURCE_DATA
	select ARCH_HAS_DEVMEM_IS_ALLOWED
[...]
There are several configuration utilities that can read the Kconfig files and produce a .config file. Some of them display the menus on screen and allow you to make choices interactively. menuconfig is probably the one most people are familiar with, but there are also xconfig and gconfig.
You launch each one via the make command, remembering that, in the case of the kernel, you have to supply an architecture, as illustrated here:
$ make ARCH=arm menuconfig
Here, you can see menuconfig with the DEVMEM config option described in the previous paragraph highlighted:
The star (*) to the left of an item means that it is selected (Y) or, if it is an M, that it has been selected to be built as a kernel module.
You often see instructions like enable CONFIG_BLK_DEV_INITRD, but with so many menus to browse through, it can take a while to find the place where that configuration is set. All configuration editors have a search function. You can access it in menuconfig by pressing the forward slash key, /. In xconfig, it is in the edit menu, but make sure you leave off the CONFIG_ part of the configuration item you are searching for.
With so many things to configure, it is unreasonable to start with a clean sheet each time you want to build a kernel, so there is a set of known working configuration files in arch/$ARCH/configs, each containing suitable configuration values for a single SoC or a group of SoCs.
You can select one with the make [configuration file name] command. For example, to configure Linux to run on a wide range of SoCs using the ARMv7-A architecture, you would type:
$ make ARCH=arm multi_v7_defconfig
This is a generic kernel that runs on various different boards. For a more specialized application, for example, when using a vendor-supplied kernel, the default configuration file is part of the board support package; you will need to find out which one to use before you can build the kernel.
There is another useful configuration target named oldconfig. This takes an existing .config file and asks you to supply configuration values for any options that don't have them. You would use it when moving a configuration to a newer kernel version; copy .config from the old kernel to the new source directory and run the make ARCH=arm oldconfig command to bring it up to date. It can also be used to validate a .config file that you have edited manually (ignoring the text "Automatically generated file; DO NOT EDIT" that occurs at the top; sometimes it is OK to ignore warnings).
If you do make changes to the configuration, the modified .config file becomes part of your board support package and needs to be placed under source code control.
When you start the kernel build, a header file, include/generated/autoconf.h, is generated, which contains a #define for each configuration value so that it can be included in the kernel source.
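To give a feel for the relationship between .config and the generated header, here is a small sketch that converts .config-style lines into #define lines. It is an illustration only, not the mechanism Kbuild actually uses: config_to_defines is a made-up name, the sample file is a stand-in, and the mapping shown (y becomes 1, m becomes a _MODULE define, other values are copied through) is a simplification of what you will see in the real autoconf.h.

```shell
#!/bin/sh
# Hypothetical helper: print a #define line for each CONFIG_ assignment
# in the .config-style file $1. Comment lines are skipped.
config_to_defines() {
    awk -F= '
        /^CONFIG_[A-Za-z0-9_]+=/ {
            name = $1
            value = substr($0, length(name) + 2)
            if (value == "y")
                print "#define " name " 1"
            else if (value == "m")
                print "#define " name "_MODULE 1"
            else
                print "#define " name " " value
        }' "$1"
}

# Stand-in configuration with one built-in, one module, one unset option,
# and one string value.
sample=$(mktemp)
printf '%s\n' \
    'CONFIG_DEVMEM=y' \
    'CONFIG_MTD_CMDLINE_PARTS=m' \
    '# CONFIG_TTY_PRINTK is not set' \
    'CONFIG_LOCALVERSION="-melp-v1.0"' > "$sample"

config_to_defines "$sample"
```

Comparing the output of this sketch with include/generated/autoconf.h from a real build is a quick way to see which options ended up enabled.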
Using LOCALVERSION to identify your kernel
You can discover the kernel version that you have built using the make kernelversion target:
$ make ARCH=arm kernelversion
4.9.13
This is reported at runtime through the uname command, and is also used in naming the directory where kernel modules are stored.
If you change the configuration from the default, it is advisable to append your own version information, which you can configure by setting CONFIG_LOCALVERSION. As an example, if I wanted to mark the kernel I am building with the identifier melp and version 1.0, I would define the local version in menuconfig like this:
Running make kernelversion produces the same output as before, but now if I run make kernelrelease, I see:
$ make ARCH=arm kernelrelease
4.9.13-melp-v1.0
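The relationship between the two targets in this example can be sketched as a simple concatenation. This is a simplification that only reproduces the case shown above, where the release string is the kernel version followed by the local version; the helper names are hypothetical, and /lib/modules is the conventional location of the modules directory on the target.

```shell
#!/bin/sh
# Hypothetical helper: combine a kernel version ($1) with a local
# version suffix ($2) to form the release string.
kernel_release() {
    printf '%s%s\n' "$1" "$2"
}

# Hypothetical helper: the modules directory named after the release.
modules_dir() {
    printf '/lib/modules/%s\n' "$(kernel_release "$1" "$2")"
}

kernel_release 4.9.13 -melp-v1.0
modules_dir 4.9.13 -melp-v1.0
```

The second line of output shows why the root filesystem has to be kept in step with the kernel: modules are looked up in a directory named after the exact release string.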
Kernel modules
I have mentioned kernel modules several times already. Desktop Linux distributions use them extensively so that the correct device and kernel functions can be loaded at runtime, depending on the hardware detected and features required. Without them, every single driver and feature would have to be statically linked into the kernel, making it infeasibly large.
On the other hand, with embedded devices, the hardware and kernel configuration is usually known at the time the kernel is built, and therefore modules are not so useful. In fact, they cause a problem because they create a version dependency between the kernel and the root filesystem, which can cause boot failures if one is updated but not the other. Consequently, it is quite common for embedded kernels to be built without any modules at all.
Here are a few cases where kernel modules are a good idea in embedded systems:
When you have proprietary modules, for the licensing reasons given in the preceding section.
To reduce boot time by deferring the loading of non-essential drivers.
When there are a number of drivers that could be loaded and it would take up too much memory to compile them statically. For example, you have a USB interface that supports a range of devices. This is essentially the same argument as is used in desktop distributions.
Compiling – Kbuild
The kernel build system, Kbuild, is a set of make scripts that take the configuration information from the .config file, work out the dependencies, and compile everything that is necessary to produce a kernel image containing all the statically linked components, possibly a device tree binary and possibly one or more kernel modules. The dependencies are expressed in makefiles that are in each directory with buildable components. For instance, the following two lines are taken from drivers/char/Makefile:
obj-y += mem.o random.o
obj-$(CONFIG_TTY_PRINTK) += ttyprintk.o
The obj-y rule unconditionally compiles a file to produce the target, so mem.c and random.c are always part of the kernel. In the second line, ttyprintk.c is dependent on a configuration parameter. If CONFIG_TTY_PRINTK is y, it is compiled as a built-in; if it is m, it is built as a module; and if the parameter is undefined, it is not compiled at all.
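The three outcomes described above can be written down as a small lookup. The sketch below is not Kbuild itself, just a restatement of the rule in shell: the function name kbuild_disposition is invented for this example, and the argument is the value that the configuration option expands to in obj-$(CONFIG_...).

```shell
#!/bin/sh
# Hypothetical helper: report what happens to an object listed under
# obj-$(CONFIG_FOO), given the value of CONFIG_FOO ("y", "m", or empty).
kbuild_disposition() {
    case "$1" in
        y) echo "built-in" ;;       # obj-y: linked into the kernel image
        m) echo "module" ;;         # obj-m: built as a loadable module
        *) echo "not compiled" ;;   # obj-: option undefined, file ignored
    esac
}

kbuild_disposition y
kbuild_disposition m
kbuild_disposition ""
```

An undefined option expands to an empty string, so the object ends up in a list named obj-, which the build simply ignores.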
For most targets, just typing make (with the appropriate ARCH and CROSS_COMPILE) will do the job, but it is instructive to take it one step at a time.
Finding out which kernel target to build
To build a kernel image, you need to know what your bootloader expects. This is a rough guide:
U-Boot: Traditionally, U-Boot has required uImage, but newer versions can load a zImage file using the bootz command
x86 targets: Requires a bzImage file
Most other bootloaders: Require a zImage file
Here is an example of building a zImage file:
$ make -j 4 ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- zImage
The -j 4 option tells make how many jobs to run in parallel, which reduces the time taken to build. A rough guide is to run as many jobs as you have CPU cores.
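If you would rather not hard-code the job count, you can ask the build host how many cores it has. The sketch below assumes that getconf _NPROCESSORS_ONLN is available, which is the case on typical Linux hosts but is not guaranteed everywhere, so it falls back to a single job. The helper name cpu_jobs is hypothetical, and the script only prints the make command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the number of online CPU cores, or 1 if the
# host cannot tell us (getconf missing or returning something unexpected).
cpu_jobs() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    printf '%s\n' "$n"
}

# Print the build command with one job per core.
echo "make -j $(cpu_jobs) ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- zImage"
```

On a four-core build host this prints the same command as the example above.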
There is a small issue with building a uImage file for ARM with multi-platform support, which is the norm for the current generation of ARM SoC kernels. Multi-platform support for ARM was introduced in Linux 3.7. It allows a single kernel binary to run on multiple platforms and is a step on the road toward having a small number of kernels for all ARM devices. The kernel selects the correct platform by reading the machine number or the device tree passed to it by the bootloader. The problem occurs because the location of physical memory might be different for each platform, and so the relocation address for the kernel (usually 0x8000 bytes from the start of physical RAM) might also be different. The relocation address is coded into the uImage header by the mkimage command when the kernel is built, but it will fail if there is more than one relocation address to choose from. To put it another way, the uImage format is not compatible with multi-platform images. You can still create a uImage binary from a multi-platform build, so long as you give the LOADADDR of the particular SoC you are hoping to boot this kernel on. You can find the load address by looking in mach-[your SoC]/Makefile.boot and noting the value of zreladdr-y:
$ make -j 4 ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- \
LOADADDR=0x80008000 uImage
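The arithmetic behind that LOADADDR value is straightforward. The sketch below adds the usual 0x8000 offset to the start of physical RAM; the RAM base of 0x80000000 is inferred from the LOADADDR used above rather than stated in the text, and load_addr is a hypothetical helper. Treat the value of zreladdr-y in Makefile.boot as the authoritative source for your SoC.

```shell
#!/bin/sh
# Hypothetical helper: compute a kernel load address from the start of
# physical RAM ($1) and an optional offset ($2, default 0x8000).
load_addr() {
    printf '0x%08x\n' "$(( $1 + ${2:-0x8000} ))"
}

# RAM base inferred from the LOADADDR in the example above (assumption).
load_addr 0x80000000
```

A SoC whose RAM starts at a different address would give a different result, which is exactly why a single uImage cannot serve several platforms.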
Build artifacts
A kernel build generates two files in the top level directory: vmlinux and System.map. The first, vmlinux, is the kernel as an ELF binary. If you have compiled your kernel with debug enabled (CONFIG_DEBUG_INFO=y), it will contain debug symbols which can be used with debuggers like kgdb. You can also use other ELF binary tools, such as size:
$ arm-cortex_a8-linux-gnueabihf-size vmlinux
    text    data    bss      dec    hex filename
10605896 5291748 351864 16249508 f7f2a4 vmlinux
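The dec and hex columns are simply the sum of the text, data, and bss sizes, which you can confirm with a little shell arithmetic. The helper name section_total is made up for this example; the three numbers are the ones from the output above.

```shell
#!/bin/sh
# Hypothetical helper: add up the text, data, and bss sizes reported
# by the size command.
section_total() {
    echo "$(( $1 + $2 + $3 ))"
}

total=$(section_total 10605896 5291748 351864)

# Print the total in decimal and hexadecimal, matching the dec and hex
# columns of the size output.
printf '%d %x\n' "$total" "$total"
```

The result, 16249508 in decimal or f7f2a4 in hexadecimal, matches the last two numeric columns of the size output.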
System.map contains the symbol table in a human readable form.
Most bootloaders cannot handle ELF code directly. There is a further stage of processing which takes vmlinux and places those binaries in arch/$ARCH/boot that are suitable for the various bootloaders:
Image: vmlinux converted to raw binary format.
zImage: For the PowerPC architecture, this is just a compressed version of Image, which implies that the bootloader must do the decompression. For all other architectures, the compressed Image is piggybacked onto a stub of code that decompresses and relocates it.
uImage: zImage plus a 64-byte U-Boot header.
While the build is running, you will see a summary of the commands being executed:
$ make -j 4 ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- \
zImage
  CHK     include/config/kernel.release
  CHK     include/generated/uapi/linux/version.h
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/kallsyms
  HOSTCC  scripts/dtc/dtc.o
[...]
Sometimes, when the kernel build fails, it is useful to see the actual commands being executed. To do that, add V=1 to the command line:
$ make -j 4 ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- \
V=1 zImage
[...]
arm-cortex_a8-linux-gnueabihf-gcc -Wp,-MD,arch/arm/kernel/.irq.o.d -nostdinc -isystem /home/chris/x-tools/arm-cortex_a8-linux-gnueabihf/lib/gcc/arm-cortex_a8-linux-gnueabihf/5.2.0/include -I./arch/arm/include -I./arch/arm/include/generated/uapi -I./arch/arm/include/generated -I./include -I./arch/arm/include/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -mlittle-endian -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -std=gnu89 -fno-PIE -fno-dwarf2-cfi-asm -fno-ipa-sra -mabi=aapcs-linux -mno-thumb-interwork -mfpu=vfp -funwind-tables -marm -D__LINUX_ARM_ARCH__=7 -march=armv7-a -msoft-float -Uarm -fno-delete-null-pointer-checks -O2 --param=allow-store-data-races=0 -Wframe-larger-than=1024 -fno-stack-protector -Wno-unused-but-set-variable -fomit-frame-pointer -fno-var-tracking-assignments -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -DCC_HAVE_ASM_GOTO -DKBUILD_BASENAME='"irq"' -DKBUILD_MODNAME='"irq"' -c -o arch/arm/kernel/irq.o arch/arm/kernel/irq.c
[...]
Compiling device trees
The next step is to build the device tree, or trees if you have a multi-platform build. The dtbs target builds device trees according to the rules in arch/$ARCH/boot/dts/Makefile, using the device tree source files in that directory. Following is a snippet from building the dtbs target for multi_v7_defconfig:
$ make ARCH=arm dtbs
[...]
  DTC     arch/arm/boot/dts/alpine-db.dtb
  DTC     arch/arm/boot/dts/artpec6-devboard.dtb
  DTC     arch/arm/boot/dts/at91-kizbox2.dtb
  DTC     arch/arm/boot/dts/at91-sama5d2_xplained.dtb
  DTC     arch/arm/boot/dts/at91-sama5d3_xplained.dtb
  DTC     arch/arm/boot/dts/sama5d31ek.dtb
[...]
The compiled .dtb files are generated in the same directory as the sources.
Compiling modules
If you have configured some features to be built as modules, you can build them separately using the modules target:
$ make -j 4 ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- \
modules
The compiled modules have a .ko suffix and are generated in the same directory as the source code, meaning that they are scattered all around the kernel source tree. Finding them is a little tricky, but you can use the modules_install make target to install them in the right place. The default location is /lib/modules in your development system, which is almost certainly not what you want. To install them into the staging area of your root filesystem (we will talk about root filesystems in the next chapter), provide the path using INSTALL_MOD_PATH:
$ make -j 4 ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- \
INSTALL_MOD_PATH=$HOME/rootfs modules_install
Kernel modules are put into the directory /lib/modules/[kernel version], relative to the root of the filesystem.
Cleaning kernel sources
There are three make targets for cleaning the kernel source tree:
clean: Removes object files and most intermediates.
mrproper: Removes all intermediate files, including the .config file. Use this target to return the source tree to the state it was in immediately after cloning or extracting the source code. If you are curious about the name, Mr Proper is a cleaning product common in some parts of the world. The meaning of make mrproper is to give the kernel sources a really good scrub.
distclean: This is the same as mrproper, but also deletes editor backup files, patch files, and other artifacts of software development.
Building a kernel for the BeagleBone Black
In light of the information already given, here is the complete sequence of commands to build a kernel, the modules, and a device tree for the BeagleBone Black, using the Crosstool-NG ARM Cortex A8 cross compiler:
$ cd linux-stable
$ make ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- mrproper
$ make ARCH=arm multi_v7_defconfig
$ make -j 4 ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- zImage
$ make -j 4 ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- modules
$ make ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- dtbs
These commands are in the script MELP/chapter_04/build-linux-bbb.sh.
Building a kernel for QEMU
Here is the sequence of commands to build Linux for the ARM Versatile PB that is emulated by QEMU, using the Crosstool-NG V5te compiler. Because mrproper removes any existing .config file, the kernel has to be configured again before it is built, here with versatile_defconfig:
$ cd linux-stable
$ make ARCH=arm CROSS_COMPILE=arm-unknown-linux-gnueabi- mrproper
$ make ARCH=arm versatile_defconfig
$ make -j 4 ARCH=arm CROSS_COMPILE=arm-unknown-linux-gnueabi- zImage
$ make -j 4 ARCH=arm CROSS_COMPILE=arm-unknown-linux-gnueabi- modules
$ make ARCH=arm CROSS_COMPILE=arm-unknown-linux-gnueabi- dtbs
These commands are in the script MELP/chapter_04/build-linux-versatilepb.sh.
Booting the kernel
Booting Linux is highly device-dependent. In this section, I will show you how it works for the BeagleBone Black and QEMU. For other target boards, you must consult the information from the vendor or from the community project, if there is one.
At this point, you should have the zImage file and the device tree binaries built by the dtbs target for the BeagleBone Black or QEMU.
Booting the BeagleBone Black
To begin, you need a microSD card with U-Boot installed, as described in the section Installing U-Boot. Plug the microSD card into your card reader and, from the linux-stable directory, copy the files arch/arm/boot/zImage and arch/arm/boot/dts/am335x-boneblack.dtb to the boot partition. Unmount the card and plug it into the BeagleBone Black. Start a terminal emulator, such as gtkterm, and be prepared to press the space bar as soon as you see the U-Boot messages appear. Next, power on the BeagleBone Black and press the space bar. You should get a U-Boot prompt, U-Boot#. Now enter the following commands to load Linux and the device tree binary:
U-Boot# fatload mmc 0:1 0x80200000 zImage
reading zImage
7062472 bytes read in 447 ms (15.1 MiB/s)
U-Boot# fatload mmc 0:1 0x80f00000 am335x-boneblack.dtb
reading am335x-boneblack.dtb
34184 bytes read in 10 ms (3.3 MiB/s)
U-Boot# setenv bootargs console=ttyO0
U-Boot# bootz 0x80200000 - 0x80f00000
## Flattened Device Tree blob at 80f00000
   Booting using the fdt blob at 0x80f00000
   Loading Device Tree to 8fff4000, end 8ffff587 ... OK
Starting kernel ...
[    0.000000] Booting Linux on physical CPU 0x0
[...]
Note that we set the kernel command line to console=ttyO0. That tells Linux which device to use for console output, which in this case is the first UART on the board, device ttyO0. Without this, we would not see any messages after Starting kernel ..., and therefore would not know if it was working or not. The sequence will end in a kernel panic, for reasons I will explain later on.
Booting QEMU
Assuming that you have already installed qemu-system-arm, you can launch it with the kernel and the .dtb file for the ARM Versatile PB, as follows:
$ QEMU_AUDIO_DRV=none \
qemu-system-arm -m 256M -nographic -M versatilepb -kernel zImage \
-append "console=ttyAMA0,115200" -dtb versatile-pb.dtb
Note that setting QEMU_AUDIO_DRV to none is just to suppress error messages from QEMU about missing configurations for the audio drivers, which we do not use. As with the BeagleBone Black, this will end with a kernel panic and the system will halt. To exit from QEMU, type Ctrl + A and then x (two separate keystrokes).
Kernel panic
While things started off well, they ended badly:
[    1.886379] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    1.895105] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
This is a good example of a kernel panic. A panic occurs when the kernel encounters an unrecoverable error. By default, it will print out a message to the console and then halt. You can set the panic command-line parameter to allow a few seconds before it reboots following a panic. In this case, the unrecoverable error is the lack of a root filesystem, illustrating that a kernel is useless without a user space to control it. You can supply a user space by providing a root filesystem, either as a ramdisk or on a mountable mass storage device. We will talk about how to create a root filesystem in the next chapter, but first I want to describe the sequence of events that leads up to the panic.
Early user space
In order to transition from kernel initialization to user space, the kernel has to mount a root filesystem and execute a program in that root filesystem. This can be achieved via a ramdisk or by mounting a real filesystem on a block device. The code for all of this is in init/main.c, starting with the function rest_init(), which creates the first thread with PID 1 and runs the code in kernel_init(). If there is a ramdisk, it will try to execute the program /init, which will take on the task of setting up the user space.
If it fails to find and run /init, it tries to mount a filesystem by calling the function prepare_namespace() in init/do_mounts.c. This requires a root= command-line parameter to give the name of the block device to use for mounting, usually in the form:
root=/dev/<disk name><partition number>
Or, for SD cards and eMMC:
root=/dev/<disk name>p<partition number>
For example, for the first partition on an SD card, that would be root=/dev/mmcblk0p1. If the mount succeeds, it will try to execute /sbin/init, followed by /etc/init, /bin/init, and then /bin/sh, stopping at the first one that works.
The program can be overridden on the command line. For a ramdisk, use rdinit=, and for a filesystem, use init=.
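The fallback order can be modeled on the host to predict which program a given root filesystem would start. This is only a rough sketch: the kernel actually attempts to execute each candidate, whereas the script merely checks that an executable file exists, and the throwaway directory stands in for a real staging area.

```shell
#!/bin/sh
# Build a throwaway root filesystem that contains only /bin/sh.
rootfs=$(mktemp -d)
mkdir -p "$rootfs/bin"
printf '#!/bin/sh\n' > "$rootfs/bin/sh"
chmod +x "$rootfs/bin/sh"

# Try each candidate in the order described above and stop at the first
# one that exists and is executable.
found=""
for candidate in /sbin/init /etc/init /bin/init /bin/sh; do
    if [ -x "$rootfs$candidate" ]; then
        found=$candidate
        break
    fi
done
rm -rf "$rootfs"

echo "would execute: ${found:-nothing}"
```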
Kernel messages
Kernel developers are fond of printing out useful information through liberal use of printk() and similar functions. The messages are categorized according to importance, with 0 being the highest:
Level         Value  Meaning
KERN_EMERG    0      The system is unusable
KERN_ALERT    1      Action must be taken immediately
KERN_CRIT     2      Critical conditions
KERN_ERR      3      Error conditions
KERN_WARNING  4      Warning conditions
KERN_NOTICE   5      Normal but significant conditions
KERN_INFO     6      Informational
KERN_DEBUG    7      Debug-level messages
They are first written to a buffer, __log_buf, the size of which is two to the power of CONFIG_LOG_BUF_SHIFT. For example, if CONFIG_LOG_BUF_SHIFT is 16, then __log_buf is 64 KiB. You can dump the entire buffer using the command dmesg.
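The buffer size calculation is a simple shift. A short sketch using the example value from the text (the variable names are illustrative):

```shell
#!/bin/sh
# Example value of CONFIG_LOG_BUF_SHIFT from the text.
shift_bits=16

# The buffer size is 2 to the power of the shift value.
bytes=$((1 << shift_bits))
kib=$((bytes / 1024))

echo "__log_buf: $bytes bytes ($kib KiB)"
```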
If the level of a message is less than the console log level, it is displayed on the console as well as placed in __log_buf. The default console log level is 7, meaning that messages of level 6 and lower are displayed, filtering out KERN_DEBUG, which is level 7. You can change the console log level in several ways, including by using the kernel parameter loglevel=<level>, or the command dmesg -n <level>.
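The filtering rule is a strict less-than comparison. The following sketch models it in shell; the function name on_console is hypothetical and exists only for this illustration, it is not a kernel or dmesg interface.

```shell
#!/bin/sh
# Return success if a message of the given level would reach the console.
on_console() {
    msg_level=$1
    limit=$2
    [ "$msg_level" -lt "$limit" ]
}

console_loglevel=7   # default stated in the text
shown=""
for level in 0 1 2 3 4 5 6 7; do
    if on_console "$level" "$console_loglevel"; then
        shown="$shown$level "
    fi
done

echo "levels shown on console: $shown"
```

With the default of 7, level 7 (KERN_DEBUG) is the only one left out.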
Kernel command line
The kernel command line is a string that is passed to the kernel by the bootloader, via the bootargs variable in the case of U-Boot; it can also be defined in the device tree, or set as part of the kernel configuration in CONFIG_CMDLINE.
We have seen some examples of the kernel command line already, but there are many more. There is a complete list in Documentation/kernel-parameters.txt. Here is a smaller list of the most useful ones:
Name         Description
debug        Sets the console log level to the highest level, 8, to ensure that you see all the kernel messages on the console.
init=        The init program to run from a mounted root filesystem, which defaults to /sbin/init.
lpj=         Sets loops_per_jiffy to a given constant. There is a description of the significance of this in the paragraph following this table.
panic=       Behavior when the kernel panics: if it is greater than zero, it gives the number of seconds before rebooting; if it is zero, it waits forever (this is the default); or if it is less than zero, it reboots without any delay.
quiet        Lowers the console log level so that only the most urgent messages are displayed. Since most devices have a serial console, it takes time to output all those strings. Consequently, reducing the number of messages using this option reduces boot time.
rdinit=      The init program to run from a ramdisk. It defaults to /init.
ro           Mounts the root device as read-only. Has no effect on a ramdisk, which is always read/write.
root=        Device on which to mount the root filesystem.
rootdelay=   The number of seconds to wait before trying to mount the root device; defaults to zero. Useful if the device takes time to probe the hardware, but also see rootwait.
rootfstype=  The filesystem type for the root device. In many cases, it is auto-detected during mount, but it is required for jffs2 filesystems.
rootwait     Waits indefinitely for the root device to be detected. Usually necessary with mmc devices.
rw           Mounts the root device as read-write (default).
The lpj parameter is often mentioned in connection with reducing the kernel boot time. During initialization, the kernel loops for approximately 250 ms to calibrate a delay loop. The value is stored in the variable loops_per_jiffy, and reported like this:
Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736)
If the kernel always runs on the same hardware, it will always calculate the same value. You can shave 250 ms off the boot time by adding lpj=4980736 to the command line.
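Picking the value out of the boot log can be automated. The sketch below works on the sample line shown above, held in a variable; on a real target the line would come from the kernel log instead. The console=ttyO0 argument is just an example of an existing command line to append to.

```shell
#!/bin/sh
# Sample boot log line, as shown above.
line='Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736)'

# Extract the lpj=<number> token from between the parentheses.
lpj_arg=$(printf '%s\n' "$line" | sed -n 's/.*(\(lpj=[0-9][0-9]*\)).*/\1/p')

# Append it to an example command line.
bootargs="console=ttyO0 $lpj_arg"
echo "$bootargs"
```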
Porting Linux to a new board
Porting Linux to a new board can be easy or difficult, depending on how similar your board is to an existing development board. In Chapter 3, All About Bootloaders, we ported U-Boot to a new board, named Nova, which is based on the BeagleBone Black. There are very few changes to be made to the kernel code and so it is very easy. If you are porting to completely new and innovative hardware, there will be more to do. I am only going to consider the simple case.
The organization of architecture-specific code in arch/$ARCH differs from one system to another. The x86 architecture is pretty clean because most hardware details are detected at runtime. The PowerPC architecture puts SoC and board-specific files into the subdirectory platforms. The ARM architecture, on the other hand, is quite messy, in part because there is a lot of variability between the many ARM-based SoCs. Platform-dependent code is put in directories named mach-*, approximately one per SoC. There are other directories named plat-* which contain code common to several versions of an SoC. In the case of the BeagleBone Black, the relevant directory is arch/arm/mach-omap2. Don't be fooled by the name though; it contains support for OMAP2, 3, and 4 chips, as well as the AM33xx family of chips that the BeagleBone uses.
In the following sections, I am going to explain how to create a device tree for a new board and how to key that into the initialization code of Linux.
A new device tree
The first thing to do is create a device tree for the board and modify it to describe the additional or changed hardware of the Nova board. In this simple case, we will just copy am335x-boneblack.dts to nova.dts and change the board name in nova.dts, as shown in the model property here:
/dts-v1/;
#include "am33xx.dtsi"
#include "am335x-bone-common.dtsi"
#include <dt-bindings/display/tda998x.h>
/ {
	model = "Nova";
	compatible = "ti,am335x-bone-black", "ti,am335x-bone", "ti,am33xx";
};
[...]
We can build the Nova device tree binary explicitly like this:
$ make ARCH=arm nova.dtb
If we want the device tree for Nova to be compiled by make ARCH=arm dtbs whenever an AM33xx target is selected, we could add a dependency in arch/arm/boot/dts/Makefile as follows:
[...]
dtb-$(CONFIG_SOC_AM33XX) += nova.dtb
[...]
We can see the effect of using the Nova device tree by booting the BeagleBone Black, following the same procedure as in the section Booting the BeagleBone Black, with the same zImage file as before, but loading nova.dtb in place of am335x-boneblack.dtb. In the following output, the line beginning OF: fdt: is the point at which the machine model is printed out:
Starting kernel ...
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.9.13-melp-v1.0-dirty (chris@chris-xps) (gcc version 5.2.0 (crosstool-NG crosstool-ng-1.22.0) ) #2 SMP Fri Mar 24 17:51:41 GMT 2017
[    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] OF: fdt:Machine model: Nova
[...]
Now that we have a device tree specifically for the Nova board, we could modify it to describe the hardware differences between Nova and the BeagleBone Black. There are quite likely to be changes to the kernel configuration as well, in which case you would create a custom configuration file based on a copy of arch/arm/configs/multi_v7_defconfig.
Setting the board compatible property
Creating a new device tree means that we can describe the hardware on the Nova board, selecting device drivers and setting properties to match. But, suppose the Nova board needs different early initialization code than the BeagleBone Black; how can we link that in?
The board setup is controlled by the compatible property in the root node. This is what we have for the Nova board at the moment:
/ {
	model = "Nova";
	compatible = "ti,am335x-bone-black", "ti,am335x-bone", "ti,am33xx";
};
When the kernel parses this node, it will search for a matching machine for each of the values of the compatible property, starting on the left and stopping with the first match found. Each machine is defined in a structure delimited by DT_MACHINE_START and MACHINE_END macros. In arch/arm/mach-omap2/board-generic.c, we find:
#ifdef CONFIG_SOC_AM33XX
static const char *const am33xx_boards_compat[] __initconst = {
	"ti,am33xx",
	NULL,
};

DT_MACHINE_START(AM33XX_DT, "Generic AM33XX (Flattened Device Tree)")
	.reserve	= omap_reserve,
	.map_io		= am33xx_map_io,
	.init_early	= am33xx_init_early,
	.init_machine	= omap_generic_init,
	.init_late	= am33xx_init_late,
	.init_time	= omap3_gptimer_timer_init,
	.dt_compat	= am33xx_boards_compat,
	.restart	= am33xx_restart,
MACHINE_END
#endif
Note that the string array, am33xx_boards_compat, contains "ti,am33xx" which matches one of the machines listed in the compatible property. In fact, it is the only match possible, since there are none for ti,am335x-bone-black or ti,am335x-bone. The structure between DT_MACHINE_START and MACHINE_END contains a pointer to the string array, and function pointers for the board setup functions. You may wonder why bother with ti,am335x-bone-black and ti,am335x-bone if they never match anything? The answer is partly that they are placeholders for the future, but also that there are places in the kernel that contain runtime tests for the machine using the function of_machine_is_compatible(). For example, in drivers/net/ethernet/ti/cpsw-common.c:
int ti_cm_get_macid(struct device *dev, int slave, u8 *mac_addr)
{
[...]
	if (of_machine_is_compatible("ti,am33xx"))
		return cpsw_am33xx_cm_get_macid(dev, 0x630, slave, mac_addr);
[...]
Thus, we have to look through not just the mach-* directories but the entire kernel source code to get a list of all the places that depend on the machine compatible property. In the 4.9 kernel, you will find that there are still no checks for ti,am335x-bone-black and ti,am335x-bone, but there may be in the future.
Returning to the Nova board, if we want to add machine specific setup, we can add a machine in arch/arm/mach-omap2/board-generic.c, like this:
#ifdef CONFIG_SOC_AM33XX
[...]
static const char *const nova_compat[] __initconst = {
	"ti,nova",
	NULL,
};

DT_MACHINE_START(NOVA_DT, "Nova board (Flattened Device Tree)")
	.reserve	= omap_reserve,
	.map_io		= am33xx_map_io,
	.init_early	= am33xx_init_early,
	.init_machine	= omap_generic_init,
	.init_late	= am33xx_init_late,
	.init_time	= omap3_gptimer_timer_init,
	.dt_compat	= nova_compat,
	.restart	= am33xx_restart,
MACHINE_END
#endif
Then we could change the device tree root node like this:
/ {
	model = "Nova";
	compatible = "ti,nova", "ti,am33xx";
};
Now, the machine will match ti,nova in board-generic.c. We keep ti,am33xx because we want the runtime tests, such as the one in drivers/net/ethernet/ti/cpsw-common.c, to continue to work.
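The effect of the ordering can be modeled outside the kernel. The following sketch is a simplified model of the left-to-right matching described earlier, written in shell for illustration; it is not the kernel's implementation, and the function name first_match is hypothetical.

```shell
#!/bin/sh
# Return the first entry of the compatible list (left to right) that
# appears in the list of strings known to the machine descriptors.
first_match() {
    compatible=$1   # space-separated, in device tree order
    machines=$2     # space-separated dt_compat strings known to the kernel
    for c in $compatible; do
        for m in $machines; do
            if [ "$c" = "$m" ]; then
                echo "$c"
                return 0
            fi
        done
    done
    return 1
}

# Before adding the Nova machine, only the generic AM33XX entry exists.
before=$(first_match "ti,am335x-bone-black ti,am335x-bone ti,am33xx" "ti,am33xx")

# After adding it, the more specific string on the left wins.
after=$(first_match "ti,nova ti,am33xx" "ti,am33xx ti,nova")

echo "before: $before"
echo "after:  $after"
```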
Additional reading
The following resources have further information about the topics introduced in this chapter:
Linux Kernel Development, 3rd Edition by Robert Love
Linux Weekly News, https://lwn.net/
Summary
Linux is a very powerful and complex operating system kernel that can be married to various types of user space, ranging from a simple embedded device, through increasingly complex mobile devices using Android, to a full server operating system. One of its strengths is the degree of configurability. The definitive place to get the source code is https://www.kernel.org/, but you will probably need to get the source for a particular SoC or board from the vendor of that device or a third party that supports that device. The customization of the kernel for a particular target may consist of changes to the core kernel code, additional drivers for devices that are not in mainline Linux, a default kernel configuration file, and a device tree source file.
Normally, you start with the default configuration for your target board, and then tweak it by running one of the configuration tools such as menuconfig. One of the things you should consider at this point is whether the kernel features and drivers should be compiled as modules or built-in. Kernel modules are usually no great advantage for embedded systems, where the feature set and hardware are usually well defined. However, modules are often used as a way to import proprietary code into the kernel, and also to reduce boot time by loading non-essential drivers after boot.
Building the kernel produces a compressed kernel image file, named zImage, bzImage, or uImage, depending on the bootloader you will be using and the target architecture. A kernel build will also generate any kernel modules (as .ko files) that you have configured, and device tree binaries (as .dtb files) if your target requires them.
Porting Linux to a new target board can be quite simple or very difficult, depending on how different the hardware is from that in the mainline or vendor supplied kernel. If your hardware is based on a well-known reference design, then it may be just a question of making changes to the device tree or to the platform data. You may well need to add device drivers, which I discuss in Chapter 9, Interfacing with Device Drivers. However, if the hardware is radically different to a reference design, you may need additional core support, which is outside the scope of this book.
The kernel is the core of a Linux-based system, but it cannot work by itself. It requires a root filesystem that contains the user space components. The root filesystem can be a ramdisk or a filesystem accessed via a block device, which will be the subject of the next chapter. As we have seen, booting a kernel without a root filesystem results in a kernel panic.
Building a Root Filesystem
The root filesystem is the fourth and the final element of embedded Linux. Once you have read this chapter, you will be able to build, boot, and run a simple embedded Linux system.
The techniques I will describe here are broadly known as roll your own or RYO. Back in the earlier days of embedded Linux, this was the only way to create a root filesystem. There are still some use cases where an RYO root filesystem is applicable, for example, when the amount of RAM or storage is very limited, for quick demonstrations, or for any case in which your requirements are not (easily) covered by the standard build system tools. Nevertheless, these cases are quite rare. Let me emphasize that the purpose of this chapter is educational; it is not meant to be a recipe for building everyday embedded systems: use the tools described in the next chapter for this.
The first objective is to create a minimal root filesystem that will give us a shell prompt. Then, using this as a base, we will add scripts to start up other programs and configure a network interface and user permissions. There are worked examples for both the BeagleBone Black and QEMU targets. Knowing how to build the root filesystem from scratch is a useful skill, and it will help you to understand what is going on when we look at more complex examples in later chapters.
In this chapter, we will cover the following topics:
What should be in the root filesystem?
Transferring the root filesystem to the target.
Creating a boot initramfs.
The init program.
Configuring user accounts.
A better way of managing device nodes.
Configuring the network.
Creating filesystem images with device tables.
Mounting the root filesystem using NFS.
What should be in the root filesystem?
The kernel will get a root filesystem, either as an initramfs passed as a pointer from the bootloader, or by mounting the block device given on the kernel command line by the root= parameter. Once it has a root filesystem, the kernel will execute the first program, by default named init, as described in the section Early user space in Chapter 4, Configuring and Building the Kernel. Then, as far as the kernel is concerned, its job is complete. It is up to the init program to begin starting other programs and so bring the system to life.
To make a minimal root filesystem, you need these components:
init: This is the program that starts everything off, usually by running a series of scripts. I will describe how init works in much more detail in Chapter 10, Starting Up – The init Program.
Shell: You need a shell to give you a command prompt but, more importantly, also to run the shell scripts called by init and other programs.
Daemons: A daemon is a background program that provides a service to others. Good examples are the system log daemon (syslogd) and the secure shell daemon (sshd). The init program must start the initial population of daemons to support the main system applications. In fact, init is itself a daemon: it is the daemon that provides the service of launching other daemons.
Shared libraries: Most programs are linked with shared libraries, and so they must be present in the root filesystem.
Configuration files: The configuration for init and other daemons is stored in a series of text files, usually in the /etc directory.
Device nodes: These are the special files that give access to various device drivers.
/proc and /sys: These two pseudo filesystems represent kernel data structures as a hierarchy of directories and files. Many programs and library functions depend on proc and sys.
Kernel modules: If you have configured some parts of your kernel to be modules, they need to be installed in the root filesystem, usually in /lib/modules/[kernel version].
In addition, there are the device-specific applications that make the device do the job it is intended for, and also the run-time data files that they generate.
In some cases, you could condense most of these components into a single, statically-linked program, and start the program instead of init. For example, if your program was named /myprog, you would add the following command to the kernel command line: init=/myprog. I have come across such a configuration only once, in a secure system in which the fork system call had been disabled, thus making it impossible for any other program to be started. The downside of this approach is that you can't make use of the many tools that normally go into an embedded system; you have to do everything yourself.
The directory layout
Interestingly, the Linux kernel does not care about the layout of files and directories beyond the existence of the program named by init= or rdinit=, so you are free to put things wherever you like. As an example, compare the file layout of a device running Android to that of a desktop Linux distribution: they are almost completely different.
However, many programs expect certain files to be in certain places, and it helps us developers if devices use a similar layout, Android aside. The basic layout of a Linux system is defined in the Filesystem Hierarchy Standard (FHS), which is available at http://refspecs.linuxfoundation.org/fhs.shtml. The FHS covers all the implementations of Linux operating systems from the largest to the smallest. Embedded devices tend to use a subset based on their needs, but it usually includes the following:
/bin: Programs essential for all users
/dev: Device nodes and other special files
/etc: System configuration files
/lib: Essential shared libraries, for example, those that make up the C-library
/proc: The proc filesystem
/sbin: Programs essential to the system administrator
/sys: The sysfs filesystem
/tmp: A place to put temporary or volatile files
/usr: Additional programs, libraries, and system administrator utilities, in the directories /usr/bin, /usr/lib and /usr/sbin, respectively
/var: A hierarchy of files and directories that may be modified at runtime, for example, log messages, some of which must be retained after boot
There are some subtle distinctions here. The difference between /bin and /sbin is simply that the latter need not be included in the search path for non-root users. Users of Red Hat-derived distributions will be familiar with this. The significance of /usr is that it may be in a separate partition from the root filesystem, so it cannot contain anything that is needed to boot the system up.
The staging directory
You should begin by creating a staging directory on your host computer where you can assemble the files that will eventually be transferred to the target. In the following examples, I have used ~/rootfs. You need to create a skeleton directory structure in it, for example:
$ mkdir ~/rootfs
$ cd ~/rootfs
$ mkdir bin dev etc home lib proc sbin sys tmp usr var
$ mkdir usr/bin usr/lib usr/sbin
$ mkdir -p var/log
To see the directory hierarchy more clearly, you can use the handy tree command, used in the following example with the -d option to show only the directories:
$ tree -d
.
├── bin
├── dev
├── etc
├── home
├── lib
├── proc
├── sbin
├── sys
├── tmp
├── usr
│   ├── bin
│   ├── lib
│   └── sbin
└── var
    └── log
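The skeleton can also be created in a single loop. This is a minimal sketch that uses a temporary directory instead of ~/rootfs so that it can be run without touching a real staging area; the variable names are illustrative.

```shell
#!/bin/sh
# Use a temporary directory here so the sketch does not touch ~/rootfs.
staging=$(mktemp -d)

# mkdir -p creates parents as needed and does not fail if a directory exists.
for dir in bin dev etc home lib proc sbin sys tmp \
           usr/bin usr/lib usr/sbin var/log; do
    mkdir -p "$staging/$dir"
done

# Count the directories below the staging root (excluding the root itself).
count=$(( $(find "$staging" -type d | wc -l) - 1 ))
echo "created $count directories in $staging"
rm -rf "$staging"
```

The count matches the number of entries in the tree listing above.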
POSIX file access permissions
Every process, which in the context of this discussion means every running program, belongs to a user and one or more groups. The user is represented by a 32-bit number called the user ID or UID. Information about users, including the mapping from a UID to a name, is kept in /etc/passwd. Likewise, groups are represented by a group ID or GID with information kept in /etc/group. There is always a root user with a UID of 0 and a root group with a GID of 0. The root user is also called the superuser because, in a default configuration, it bypasses most permission checks and can access all the resources in the system. Security in Linux-based systems is mainly about restricting access to the root account.
Each file and directory also has an owner and belongs to exactly one group. The level of access a process has to a file or directory is controlled by a set of access permission flags, called the mode of the file. There are three collections of three bits: the first collection applies to the owner of the file, the second to the members of the same group as the file, and the last to everyone else: the rest of the world. The bits are for read (r), write (w), and execute (x) permissions on the file. Since three bits fit neatly into an octal digit, they are usually represented in octal. For example, a mode of 755 gives the owner read, write, and execute permission (7), and gives the group and everyone else read and execute permission (5).
There is a further group of three bits that have special meanings:
SUID (4): If the file is executable, it changes the effective UID of the process to that of the owner of the file when the program is run.
SGID (2): Similar to SUID, this changes the effective GID of the process to that of the group of the file.
Sticky (1): In a directory, this restricts deletion so that one user cannot delete files that are owned by another user. This is usually set on /tmp and /var/tmp.
The SUID bit is probably used most often. It gives non-root users a temporary privilege escalation to superuser to perform a task. A good example is the ping program: ping opens a raw socket, which is a privileged operation. In order for normal users to use ping, it is owned by user root and has the SUID bit set so that when you run ping, it executes with UID 0 regardless of your UID.
To set these bits, use the octal numbers, 4, 2, and 1 with the chmod command. For example, to set SUID on /bin/ping in your staging root directory, you could use the following:
$ cd ~/rootfs
$ ls -l bin/ping
-rwxr-xr-x 1 root root 35712 Feb  6 09:15 bin/ping
$ sudo chmod 4755 bin/ping
$ ls -l bin/ping
-rwsr-xr-x 1 root root 35712 Feb  6 09:15 bin/ping
Note that the second ls command shows the first three bits of the mode to be rws, whereas previously, they had been rwx. That s indicates that the SUID bit is set.
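The same change can be observed without sudo on a file you own. The following sketch assumes a Linux host and uses a scratch file created with mktemp in place of bin/ping; it captures the mode string that ls prints before and after setting the SUID bit.

```shell
#!/bin/sh
# Work on a scratch file that we own, so sudo is not needed.
scratch=$(mktemp)

chmod 755 "$scratch"
before=$(ls -l "$scratch" | cut -c1-10)

# The leading 4 sets SUID; 755 keeps rwx for the owner and r-x for group and others.
chmod 4755 "$scratch"
after=$(ls -l "$scratch" | cut -c1-10)

rm -f "$scratch"
echo "before: $before"
echo "after:  $after"
```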
File ownership permissions in the staging directory
For security and stability reasons, it is vitally important to pay attention to the ownership and permissions of the files that will be placed on the target device. Generally speaking, you want to restrict sensitive resources to be accessible only by the root user and, wherever possible, to run programs using non-root users so that if they are compromised by an outside attack, they offer as few system resources to the attacker as possible. For example, the device node called /dev/mem gives access to system memory, which is necessary in some programs. But, if it is readable and writeable by everyone, then there is no security because everyone can access everything in memory. So, /dev/mem should be owned by root, belong to the root group, and have a mode of 600, which denies read and write access to all but the owner.
There is a problem with the staging directory though. The files you create there will be owned by you, but when they are installed on the device, they should belong to specific owners and groups, mostly the root user. An obvious fix is to change the ownership to root at this stage with the commands shown here:
$ cd ~/rootfs
$ sudo chown -R root:root *
The problem is that you need root privileges to run the chown command, and from that point onward, you will need to be root to modify any files in the staging directory. Before you know it, you are doing all your development logged on as root, which is not a good idea. This is a problem that we will come back to later.
Programs for the root filesystem
Now, it is time to start populating the root filesystem with the essential programs and the supporting libraries, configuration, and data files that they need to operate. I will begin with an overview of the types of programs you will need.
The init program
Init is the first program to be run, and so it is an essential part of the root filesystem. In this chapter, we will be using the simple init program provided by BusyBox.
Shell
We need a shell to run scripts and to give us a command prompt so that we can interact with the system. An interactive shell is probably not necessary in a production device, but it is useful for development, debugging, and maintenance. There are various shells in common use in embedded systems:
bash: This is the big beast that we all know and love from desktop Linux. It is a superset of the Unix Bourne shell with many extensions or bashisms.
ash: Also based on the Bourne shell, it has a long history with the BSD variants of Unix. BusyBox has a version of ash, which has been extended to make it more compatible with bash. It is much smaller than bash, and hence it is a very popular choice for embedded systems.
hush: This is a very small shell that we briefly looked at in Chapter 3, All about Bootloaders. It is useful on devices with very little memory. There is a version of hush in BusyBox.
If you are using ash or hush as the shell on the target, make sure that you test your shell scripts on the target. It is very tempting to test them only on the host, using bash, and then be surprised that they don't work when you copy them to the target.
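As an illustration of the kind of difference to watch for, the bash-only test [[ $name == eth* ]] is not part of the POSIX shell language, so it may not behave the same way in a smaller shell, depending on how that shell was built. The sketch below shows one portable alternative using a case statement; the function name is_ethernet and the interface names are made up for the example.

```shell
#!/bin/sh
# A bash-only test such as [[ $name == eth* ]] is not POSIX, so it may not
# work in a smaller shell. A case statement is a portable equivalent.
is_ethernet() {
    case "$1" in
        eth*) return 0 ;;
        *)    return 1 ;;
    esac
}

for name in eth0 wlan0; do
    if is_ethernet "$name"; then
        echo "$name: ethernet"
    else
        echo "$name: other"
    fi
done
```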
Utilities
The shell is just a way of launching other programs, and a shell script is little more than a list of programs to run, with some flow control and a means of passing information between programs. To make a shell useful, you need the utility programs that the Unix command line is based on. Even for a basic root filesystem, you need approximately 50 utilities, which presents two problems. Firstly, tracking down the source code for each one and cross-compiling it would be quite a big job. Secondly, the resulting collection of programs would take up several tens of megabytes, which was a real problem in the early days of embedded Linux when a few megabytes was all you had. To solve this problem, BusyBox was born.
BusyBox to the rescue!
The genesis of BusyBox had nothing to do with embedded Linux. The project was instigated in 1996 by Bruce Perens for the Debian installer so that he could boot Linux from a 1.44 MB floppy disk. Coincidentally, this was about the size of the storage on contemporary devices, and so the embedded Linux community quickly took it up. BusyBox has been at the heart of embedded Linux ever since.
BusyBox was written from scratch to perform the essential functions of those essential Linux utilities. The developers took advantage of the 80:20 rule: the most useful 80% of a program is implemented in 20% of the code. Hence, BusyBox tools implement a subset of the functions of the desktop equivalents, but they do enough of it to be useful in the majority of cases.
AnothertrickBusyBoxemploysistocombineallthetoolstogetherintoasinglebinary,makingiteasytosharecodebetweenthem.Itworkslikethis:BusyBoxisacollectionofapplets,eachofwhichexportsitsmainfunctionintheform[applet]_main.Forexample,thecatcommandisimplementedincoreutils/cat.candexportscat_main.ThemainfunctionofBusyBoxitselfdispatchesthecalltothecorrectapplet,basedonthecommand-linearguments.
So,toreadafile,youcanlaunchBusyBoxwiththenameoftheappletyouwanttorun,followedbyanyargumentstheappletexpects,asshownhere:
$ busybox cat my_file.txt
You can also run BusyBox with no arguments to get a list of all the applets that have been compiled.
Using BusyBox in this way is rather clumsy. A better way to get BusyBox to run the cat applet is to create a symbolic link from /bin/cat to /bin/busybox:
$ ls -l bin/cat bin/busybox
-rwxr-xr-x 1 root root 892868 Feb 2 11:01 bin/busybox
lrwxrwxrwx 1 root root 7 Feb 2 11:01 bin/cat -> busybox
When you type cat at the command line, BusyBox is the program that actually runs. BusyBox only has to check the command tail passed in argv[0], which will be /bin/cat, extract the application name, cat, and do a table look-up to match cat with cat_main. All this is in libbb/appletlib.c in this section of code (slightly simplified):
applet_name = argv[0];
applet_name = bb_basename(applet_name);
run_applet_and_exit(applet_name, argv);
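The same dispatch-by-name idea can be imitated in a few lines of shell. The following is only a minimal sketch and not BusyBox code: the script name multicall and the applets hello and bye are hypothetical, chosen to show how one program can behave differently depending on the name of the symbolic link used to run it.

```shell
#!/bin/sh
# Sketch of multi-call dispatch. "multicall", "hello", and "bye" are
# hypothetical names used for illustration; they are not part of BusyBox.
workdir=$(mktemp -d)

cat > "$workdir/multicall" <<'EOF'
#!/bin/sh
# Pick the applet from the name we were invoked as, much as BusyBox
# takes the basename of argv[0]
applet=$(basename "$0")
case "$applet" in
    hello) echo "hello applet" ;;
    bye)   echo "bye applet" ;;
    *)     echo "unknown applet: $applet" ;;
esac
EOF

# One symbolic link per applet, all pointing at the same script
ln -s multicall "$workdir/hello"
ln -s multicall "$workdir/bye"

# Run through sh so the sketch also works if the temp directory is noexec
hello_out=$(sh "$workdir/hello")
bye_out=$(sh "$workdir/bye")
echo "$hello_out"
echo "$bye_out"

rm -rf "$workdir"
```

Both links resolve to the same file, yet each run takes a different branch because the invocation name differs.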
BusyBox has over three hundred applets including an init program, several shells of varying levels of complexity, and utilities for most admin tasks. There is even a simple version of the vi editor, so you can change text files on your device.
To summarize, a typical installation of BusyBox consists of a single program with a symbolic link for each applet, but which behaves exactly as if it were a collection of individual applications.
Building BusyBox
BusyBox uses the same Kconfig and Kbuild system as the kernel, so cross compiling is straightforward. You can get the source by cloning the Git archive and checking out the version you want (1_26_2 was the latest at the time of writing), as follows:
$ git clone git://busybox.net/busybox.git
$ cd busybox
$ git checkout 1_26_2
You can also download the corresponding TAR file from http://busybox.net/downloads.
Then, configure BusyBox by starting with the default configuration, which enables pretty much all of the features of BusyBox:
$ make distclean
$ make defconfig
At this point, you probably want to run make menuconfig to fine tune the configuration. For example, you almost certainly want to set the install path in Busybox Settings | Installation Options (CONFIG_PREFIX) to point to the staging directory. Then, you can cross compile in the usual way. If your intended target is the BeagleBone Black, use this command:
$ make ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf-
If your intended target is the QEMU emulation of a Versatile PB, use this command:
$ make ARCH=arm CROSS_COMPILE=arm-unknown-linux-gnueabi-
In either case, the result is the executable, busybox. For a default configuration build like this, the size is about 900 KiB. If this is too big for you, you can slim it down by changing the configuration to leave out the utilities you don't need.
To install BusyBox into the staging area, use the following command:
$ make ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- install
This will copy the binary to the directory configured in CONFIG_PREFIX and create all the symbolic links to it.
ToyBox – an alternative to BusyBox
BusyBox is not the only game in town. In addition, there is ToyBox, which you can find at http://landley.net/toybox/. The project was started by Rob Landley, who was previously a maintainer of BusyBox. ToyBox has the same aim as BusyBox, but with more emphasis on complying with standards, especially POSIX-2008 and LSB 4.1, and less on compatibility with GNU extensions to those standards. ToyBox is smaller than BusyBox, partly because it implements fewer applets. However, the main difference is the license, which is BSD rather than GPL v2. This makes it license compatible with operating systems with a BSD-licensed user space, such as Android, and hence it is part of all the new Android devices.
Libraries for the root filesystem
Programs are linked with libraries. You could link them all statically, in which case, there would be no libraries on the target device. But this takes up an unnecessarily large amount of storage if you have more than two or three programs. So, you need to copy shared libraries from the toolchain to the staging directory. How do you know which libraries?
One option is to copy all of the .so files from the sysroot directory of your toolchain, since they must be of some use, otherwise they wouldn't exist! This is certainly logical and, if you are creating a platform to be used by others for a range of applications, it would be the correct approach. Be aware, though, that a full glibc is quite large. In the case of a crosstool-NG build of glibc 2.22, the libraries, locales, and other supporting files come to 33 MiB. Of course, you could cut down on that considerably using musl libc or uClibc-ng.
Another option is to cherry pick only those libraries that you require, for which you need a means of discovering library dependencies. Using some of our knowledge from Chapter 2, Learning About Toolchains, we can use the readelf command for this task:
$ cd ~/rootfs
$ arm-cortex_a8-linux-gnueabihf-readelf -a bin/busybox | grep "program interpreter"
[Requesting program interpreter: /lib/ld-linux-armhf.so.3]
$ arm-cortex_a8-linux-gnueabihf-readelf -a bin/busybox | grep "Shared library"
0x00000001 (NEEDED) Shared library: [libm.so.6]
0x00000001 (NEEDED) Shared library: [libc.so.6]
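If you have many programs to check, it may help to reduce the readelf output to bare file names. The following sketch assumes the output format shown above and works on a saved copy of that output, so it runs without a cross toolchain. The variable names are illustrative only.

```shell
#!/bin/sh
# Sketch: extract library names and the program interpreter from text in
# the format printed by readelf -a. The sample text is taken from the
# output shown above; on a real system you would pipe readelf into sed.
readelf_output='[Requesting program interpreter: /lib/ld-linux-armhf.so.3]
0x00000001 (NEEDED) Shared library: [libm.so.6]
0x00000001 (NEEDED) Shared library: [libc.so.6]'

# Keep only the text between the square brackets of each NEEDED entry
needed=$(printf '%s\n' "$readelf_output" |
    sed -n 's/.*Shared library: \[\(.*\)].*/\1/p')

# Keep only the path that follows "program interpreter: "
interp=$(printf '%s\n' "$readelf_output" |
    sed -n 's/.*program interpreter: \(.*\)].*/\1/p')

echo "interpreter: $interp"
echo "libraries:"
echo "$needed"
```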
Now, you need to find these files in the toolchain sysroot directory and copy them to the staging directory. Remember that you can find sysroot like this:
$ arm-cortex_a8-linux-gnueabihf-gcc -print-sysroot
/home/chris/x-tools/arm-cortex_a8-linux-gnueabihf/arm-cortex_a8-linux-gnueabihf/sysroot
To reduce the amount of typing, I am going to keep a copy of that in a shell variable:
$ export SYSROOT=$(arm-cortex_a8-linux-gnueabihf-gcc -print-sysroot)
If you look at /lib/ld-linux-armhf.so.3 in sysroot, you will see that it is, in fact, a symbolic link:
$ cd $SYSROOT
$ ls -l lib/ld-linux-armhf.so.3
lrwxrwxrwx 1 chris chris 10 Mar 3 15:22 lib/ld-linux-armhf.so.3 -> ld-2.22.so
Repeat the exercise for libc.so.6 and libm.so.6, and you will end up with a list of three files and three symbolic links. Now, you can copy each one using cp -a, which will preserve the symbolic link:
$ cd ~/rootfs
$ cp -a $SYSROOT/lib/ld-linux-armhf.so.3 lib
$ cp -a $SYSROOT/lib/ld-2.22.so lib
$ cp -a $SYSROOT/lib/libc.so.6 lib
$ cp -a $SYSROOT/lib/libc-2.22.so lib
$ cp -a $SYSROOT/lib/libm.so.6 lib
$ cp -a $SYSROOT/lib/libm-2.22.so lib
Repeat this procedure for each program.
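Copying each link and its target by hand soon becomes tedious, so a loop may help. The sketch below is one possible approach, demonstrated on a throwaway directory tree so that it runs anywhere. The demo_sysroot and demo_staging directories are stand-ins for $SYSROOT and ~/rootfs, and the sketch assumes each link points to a file in the same directory, as in the listing above.

```shell
#!/bin/sh
# Sketch: copy a library symlink together with the file it points to.
# demo_sysroot and demo_staging are stand-ins for $SYSROOT and ~/rootfs.
demo_sysroot=$(mktemp -d)
demo_staging=$(mktemp -d)
mkdir "$demo_sysroot/lib" "$demo_staging/lib"
echo "placeholder" > "$demo_sysroot/lib/libc-2.22.so"
ln -s libc-2.22.so "$demo_sysroot/lib/libc.so.6"

for lib in libc.so.6; do
    # cp -a copies the link itself rather than following it
    cp -a "$demo_sysroot/lib/$lib" "$demo_staging/lib"
    if [ -L "$demo_sysroot/lib/$lib" ]; then
        # Assumes a relative link target in the same directory
        target=$(readlink "$demo_sysroot/lib/$lib")
        cp -a "$demo_sysroot/lib/$target" "$demo_staging/lib"
    fi
done

copied_link=$(readlink "$demo_staging/lib/libc.so.6")
copied_file=$(cat "$demo_staging/lib/libc-2.22.so")
echo "libc.so.6 -> $copied_link"

rm -rf "$demo_sysroot" "$demo_staging"
```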
It is only worth doing this to get the very smallest embedded footprint possible. There is a danger that you will miss libraries that are loaded through dlopen(3) calls – plugins mostly. We will look at an example with the name service switch (NSS) libraries when we come to configure network interfaces later on in this chapter.
Reducing the size by stripping
Libraries and programs are often compiled with some information stored in symbol tables to aid debugging and tracing. You seldom need these in a production system. A quick and easy way to save space is to strip the binaries of symbol tables. This example shows libc before stripping:
$ file rootfs/lib/libc-2.22.so
rootfs/lib/libc-2.22.so: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 4.3.0, not stripped
$ ls -og rootfs/lib/libc-2.22.so
-rwxr-xr-x 1 1542572 Mar 3 15:22 rootfs/lib/libc-2.22.so
Now, let's see the result of stripping debug information:
$ arm-cortex_a8-linux-gnueabihf-strip rootfs/lib/libc-2.22.so
$ file rootfs/lib/libc-2.22.so
rootfs/lib/libc-2.22.so: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 4.3.0, stripped
$ ls -og rootfs/lib/libc-2.22.so
-rwxr-xr-x 1 1218200 Mar 22 19:57 rootfs/lib/libc-2.22.so
In this case, we saved 324,372 bytes, or about 20% of the size of the file before stripping.
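The saving can be checked with shell arithmetic, using the two sizes reported by ls above. This is just a quick calculation. Shell arithmetic is integer-only, so the percentage is truncated, and it works out slightly above the rounded figure of 20%.

```shell
#!/bin/sh
# File sizes in bytes, taken from the ls -og output before and after strip
before=1542572
after=1218200

saved=$((before - after))
# Integer division: the result is truncated to a whole percentage
percent=$((saved * 100 / before))

echo "saved $saved bytes, about $percent% of the original size"
```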
Be careful about stripping kernel modules. Some symbols are required by the module loader to relocate the module code, and so the module will fail to load if they are stripped out. Use this command to remove debug symbols while keeping those used for relocation: strip --strip-unneeded <module name>.
Device nodes
Most devices in Linux are represented by device nodes, in accordance with the Unix philosophy that everything is a file (except network interfaces, which are sockets). A device node may refer to a block device or a character device. Block devices are mass storage devices, such as SD cards or hard drives. A character device is pretty much anything else, once again with the exception of network interfaces. The conventional location for device nodes is the directory called /dev. For example, a serial port may be represented by the device node called /dev/ttyS0.
Device nodes are created using the program named mknod (short for make node):
mknod <name> <type> <major> <minor>
The parameters to mknod are as follows:
name is the name of the device node that you want to create.
type is either c for character devices or b for block devices.
major and minor are a pair of numbers, which are used by the kernel to route file requests to the appropriate device driver code. There is a list of standard major and minor numbers in the kernel source in the file Documentation/devices.txt.
You will need to create device nodes for all the devices you want to access on your system. You can do so manually using the mknod command, as I will illustrate here; or you can create them automatically at runtime using one of the device managers that I will mention later.
In a really minimal root filesystem, you need just two nodes to boot with BusyBox: console and null. The console only needs to be accessible to root, the owner of the device node, so the access permissions are 600. The null device should be readable and writable by everyone, so the mode is 666. You can use the -m option for mknod to set the mode when creating the node. You need to be root to create device nodes, as shown here:
$ cd ~/rootfs
$ sudo mknod -m 666 dev/null c 1 3
$ sudo mknod -m 600 dev/console c 5 1
$ ls -l dev
total 0
crw------- 1 root root 5, 1 Mar 22 20:01 console
crw-rw-rw- 1 root root 1, 3 Mar 22 20:01 null
You can delete device nodes using the standard rm command: there is no rmnod command because, once created, they are just files.
The proc and sysfs filesystems
proc and sysfs are two pseudo filesystems that give a window onto the inner workings of the kernel. They both represent kernel data as files in a hierarchy of directories: when you read one of the files, the contents you see do not come from disk storage; they have been formatted on-the-fly by a function in the kernel. Some files are also writable, meaning that a kernel function is called with the new data you have written and, if it is of the correct format and you have sufficient permissions, it will modify the value stored in the kernel's memory. In other words, proc and sysfs provide another way to interact with device drivers and other kernel code. The proc and sysfs filesystems should be mounted on the directories called /proc and /sys:
# mount -t proc proc /proc
# mount -t sysfs sysfs /sys
Although they are very similar in concept, they perform different functions. proc has been part of Linux since the early days. Its original purpose was to expose information about processes to user space, hence the name. To this end, there is a directory for each process named /proc/<PID>, which contains information about its state. The process list command, ps, reads these files to generate its output. In addition, there are files that give information about other parts of the kernel, for example, /proc/cpuinfo tells you about the CPU, /proc/interrupts has information about interrupts, and so on.
Finally, in /proc/sys, there are files that display and control the state and behavior of kernel subsystems, especially scheduling, memory management, and networking. The manual page is the best reference for the files you will find in the proc directory, which you can see by typing man 5 proc.
On the other hand, the role of sysfs is to present the kernel driver model to user space. It exports a hierarchy of files relating to devices and device drivers and the way they are connected to each other. I will go into more detail on the Linux driver model when I describe the interaction with device drivers in Chapter 9, Interfacing with Device Drivers.
Mounting filesystems
The mount command allows us to attach one filesystem to a directory within another, forming a hierarchy of filesystems. The one at the top, which was mounted by the kernel when it booted, is called the root filesystem. The format of the mount command is as follows:
mount [-t vfstype] [-o options] device directory
You need to specify the type of the filesystem, vfstype, the block device node it resides on, and the directory you want to mount it to. There are various options you can give after -o; have a look at the manual page mount(8) for more information. As an example, if you want to mount an SD card containing an ext4 filesystem in the first partition onto the directory called /mnt, you would type the following command:
# mount -t ext4 /dev/mmcblk0p1 /mnt
Assuming the mount succeeds, you would be able to see the files stored on the SD card in the directory /mnt. In some cases, you can leave out the filesystem type, and let the kernel probe the device to find out what is stored there.
Looking at the example of mounting the proc filesystem, there is something odd: there is no device node, such as /dev/proc, since it is a pseudo filesystem and not a real one. But the mount command requires a device parameter. Consequently, we have to give a string where device should go, but it does not matter much what that string is. These two commands achieve exactly the same result:
# mount -t proc procfs /proc
# mount -t proc nodevice /proc
The strings "procfs" and "nodevice" are ignored by the mount command. It is fairly common to use the filesystem type in the place of the device when mounting pseudo filesystems.
Kernel modules
If you have kernel modules, they need to be installed into the root filesystem, using the kernel make target modules_install, as we saw in the last chapter. This will copy them into the directory called /lib/modules/<kernel version> together with the configuration files needed by the modprobe command.
Be aware that you have just created a dependency between the kernel and the root filesystem. If you update one, you will have to update the other.
Transferring the root filesystem to the target
After having created a skeleton root filesystem in your staging directory, the next task is to transfer it to the target. In the sections that follow, I will describe three possibilities:
initramfs: Also known as a ramdisk, this is a filesystem image that is loaded into RAM by the bootloader. Ramdisks are easy to create and have no dependencies on mass storage drivers. They can be used in fallback maintenance mode when the main root filesystem needs updating. They can even be used as the main root filesystem in small embedded devices, and they are commonly used as the early user space in mainstream Linux distributions. Remember that the contents of the root filesystem are volatile, and any changes you make in the root filesystem at runtime will be lost when the system next boots. You would need another storage type to store permanent data such as configuration parameters.
Disk image: This is a copy of the root filesystem formatted and ready to be loaded onto a mass storage device on the target. For example, it could be an image in the ext4 format ready to be copied onto an SD card, or it could be in the jffs2 format ready to be loaded into flash memory via the bootloader. Creating a disk image is probably the most common option. There is more information about the different types of mass storage in Chapter 7, Creating a Storage Strategy.
Network filesystem: The staging directory can be exported to the network via an NFS server and mounted by the target at boot time. This is often done during the development phase, in preference to repeated cycles of creating a disk image and reloading it onto the mass storage device, which is quite a slow process.
I will start with ramdisk, and use it to illustrate a few refinements to the root filesystem, such as adding usernames and a device manager to create device nodes automatically. Then, I will show you how to create a disk image and how to use NFS to mount the root filesystem over a network.
Creating a boot initramfs
An initial RAM filesystem, or initramfs, is a compressed cpio archive. cpio is an old Unix archive format, similar to TAR and ZIP but easier to decode and so requiring less code in the kernel. You need to configure your kernel with CONFIG_BLK_DEV_INITRD to support initramfs.
As it happens, there are three different ways to create a boot ramdisk: as a standalone cpio archive, as a cpio archive embedded in the kernel image, and as a device table which the kernel build system processes as part of the build. The first option gives the most flexibility, because we can mix and match kernels and ramdisks to our heart's content. However, it means that you have two files to deal with instead of one, and not all bootloaders have the facility to load a separate ramdisk. I will show you how to build one into the kernel later.
Standalone initramfs
The following sequence of instructions creates the archive, compresses it, and adds a U-Boot header ready for loading onto the target:
$ cd ~/rootfs
$ find . | cpio -H newc -ov --owner root:root > ../initramfs.cpio
$ cd ..
$ gzip initramfs.cpio
$ mkimage -A arm -O linux -T ramdisk -d initramfs.cpio.gz uRamdisk
Note that we run cpio with the option --owner root:root. This is a quick fix for the file ownership problem mentioned earlier, making everything in the cpio archive have UID and GID of 0.
The final size of the uRamdisk file is about 2.9 MB with no kernel modules. Add to that 4.4 MB for the kernel zImage file and 440 KB for U-Boot, and this gives a total of 7.7 MB of storage needed to boot this board. We are a little way off the 1.44 MB floppy that started it all off. If size was a real problem, you could use one of these options:
Make the kernel smaller by leaving out drivers and functions you don't need
Make BusyBox smaller by leaving out utilities you don't need
Use musl libc or uClibc-ng in place of glibc
Compile BusyBox statically
Booting the initramfs
The simplest thing we can do is to run a shell on the console so that we can interact with the target. We can do that by adding rdinit=/bin/sh to the kernel command line. The next two sections show how to do that for both QEMU and the BeagleBone Black.
Booting with QEMU
QEMU has the option called -initrd to load initramfs into memory. You should already have from Chapter 4, Configuring and Building the Kernel, a zImage compiled with the arm-unknown-linux-gnueabi toolchain and the device tree binary for the Versatile PB. From this chapter, you should have created initramfs, which includes BusyBox compiled with the same toolchain. Now, you can launch QEMU using the script in MELP/chapter_05/run-qemu-initramfs.sh or using this command:
$ QEMU_AUDIO_DRV=none \
qemu-system-arm -m 256M -nographic -M versatilepb -kernel zImage \
-append "console=ttyAMA0 rdinit=/bin/sh" -dtb versatile-pb.dtb \
-initrd initramfs.cpio.gz
You should get a root shell with the prompt / #.
Booting the BeagleBone Black
For the BeagleBone Black, we need the microSD card prepared in Chapter 4, Configuring and Building the Kernel, plus a root filesystem built using the arm-cortex_a8-linux-gnueabihf toolchain. Copy the uRamdisk you created earlier in this section to the boot partition on the microSD card, and then use it to boot the BeagleBone Black to the point that you get a U-Boot prompt. Then enter these commands:
fatload mmc 0:1 0x80200000 zImage
fatload mmc 0:1 0x80f00000 am335x-boneblack.dtb
fatload mmc 0:1 0x81000000 uRamdisk
setenv bootargs console=ttyO0,115200 rdinit=/bin/sh
bootz 0x80200000 0x81000000 0x80f00000
If all goes well, you will get a root shell with the prompt / # on the serial console.
Mounting proc
You will find that on both platforms the ps command doesn't work. This is because the proc filesystem has not been mounted yet. Try mounting it:
# mount -t proc proc /proc
Now, run ps again, and you will see the process listing.
A refinement to this setup would be to write a shell script that mounts proc, and anything else that needs to be done at boot-up. Then, you could run this script instead of /bin/sh at boot. The following snippet gives an idea of how it would work:
#!/bin/sh
/bin/mount -t proc proc /proc
# Other boot-time commands go here
/bin/sh
The last line, /bin/sh, launches a new shell that gives you an interactive root shell prompt. Using a shell as init in this way is very handy for quick hacks, for example, when you want to rescue a system with a broken init program. However, in most cases, you would use an init program, which we will cover later on in this chapter. But, before this, I want to look at two other ways to load initramfs.
Building an initramfs into the kernel image
So far, we have created a compressed initramfs as a separate file and used the bootloader to load it into memory. Some bootloaders do not have the ability to load an initramfs file in this way. To cope with these situations, Linux can be configured to incorporate initramfs into the kernel image. To do this, change the kernel configuration and set CONFIG_INITRAMFS_SOURCE to the full path of the cpio archive you created earlier. If you are using menuconfig, it is in General setup | Initramfs source file(s). Note that it has to be the uncompressed cpio file ending in .cpio, not the gzipped version. Then, build the kernel.
Booting is the same as before, except that there is no ramdisk file. For QEMU, the command is like this:
$ QEMU_AUDIO_DRV=none \
qemu-system-arm -m 256M -nographic -M versatilepb -kernel zImage \
-append "console=ttyAMA0 rdinit=/bin/sh" -dtb versatile-pb.dtb
For the BeagleBone Black, enter these commands at the U-Boot prompt. The hyphen in the bootz command takes the place of the ramdisk address:
fatload mmc 0:1 0x80200000 zImage
fatload mmc 0:1 0x80f00000 am335x-boneblack.dtb
setenv bootargs console=ttyO0,115200 rdinit=/bin/sh
bootz 0x80200000 - 0x80f00000
Of course, you must remember to regenerate the cpio file each time you change the contents of the root filesystem, and then rebuild the kernel.
Building an initramfs using a device table
A device table is a text file that lists the files, directories, device nodes, and links that go into an archive or filesystem image. The overwhelming advantage is that it allows you to create entries in the archive file that are owned by the root user, or any other UID, without having root privileges yourself. You can even create device nodes without needing root privileges. All this is possible because the archive is just a data file. It is only when it is expanded by Linux at boot time that real files and directories get created, using the attributes you have specified.
The kernel has a feature that allows us to use a device table when creating an initramfs. You write the device table file, and then point CONFIG_INITRAMFS_SOURCE at it. Then, when you build the kernel, it creates the cpio archive from the instructions in the device table. At no point do you need root access.
Here is a device table for our simple rootfs, but missing most of the symbolic links to BusyBox to make it manageable:
dir /bin 775 0 0
dir /sys 775 0 0
dir /tmp 775 0 0
dir /dev 775 0 0
nod /dev/null 666 0 0 c 1 3
nod /dev/console 600 0 0 c 5 1
dir /home 775 0 0
dir /proc 775 0 0
dir /lib 775 0 0
slink /lib/libm.so.6 libm-2.22.so 777 0 0
slink /lib/libc.so.6 libc-2.22.so 777 0 0
slink /lib/ld-linux-armhf.so.3 ld-2.22.so 777 0 0
file /lib/libm-2.22.so /home/chris/rootfs/lib/libm-2.22.so 755 0 0
file /lib/libc-2.22.so /home/chris/rootfs/lib/libc-2.22.so 755 0 0
file /lib/ld-2.22.so /home/chris/rootfs/lib/ld-2.22.so 755 0 0
The syntax is fairly obvious:
dir <name> <mode> <uid> <gid>
file <name> <location> <mode> <uid> <gid>
nod <name> <mode> <uid> <gid> <dev_type> <maj> <min>
slink <name> <target> <mode> <uid> <gid>
The commands dir, nod, and slink create a filesystem object in the initramfs cpio archive with the name, mode, user ID and group ID given. The file command copies the file from the source location into the archive and sets the mode, the user ID, and the group ID.
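The BusyBox symbolic links left out of the table above all follow the same pattern, so they are easy to generate. The following is a minimal sketch that prints slink lines in the syntax just described. The list of applets is a hypothetical sample, and the link target busybox assumes that the binary sits in /bin alongside the links, as in the earlier ls listing. The table would also need a file line for /bin/busybox itself.

```shell
#!/bin/sh
# Sketch: emit one slink entry per BusyBox applet.
# The applet list is a made-up sample; a real list would be much longer.
applets="cat ls sh"

table=$(for applet in $applets; do
    # slink <name> <target> <mode> <uid> <gid>
    printf 'slink /bin/%s busybox 777 0 0\n' "$applet"
done)

echo "$table"
```

The output could be appended to the device table file before pointing CONFIG_INITRAMFS_SOURCE at it.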
The task of creating an initramfs device table from scratch is made easier by a script in the kernel source code in scripts/gen_initramfs_list.sh, which creates a device table from a given directory. For example, to create the initramfs device table for directory rootfs, and to change the ownership of all files owned by user ID 1000 and group ID 1000 to user and group ID 0, you would use this command:
$ bash linux-stable/scripts/gen_initramfs_list.sh -u 1000 -g 1000 \
rootfs > initramfs-device-table
Note that the script only works with a bash shell. If you have a system with a different default shell, as is the case with most Ubuntu configurations, you will find that the script fails. Hence, in the command given previously, I explicitly used bash to run the script.
The old initrd format
There is an older format for a Linux ramdisk, known as initrd. It was the only format available before Linux 2.6 and is still needed if you are using the mmu-less variant of Linux, uClinux. It is pretty obscure and I will not cover it here. There is more information in the kernel source in Documentation/initrd.txt.
The init program
Running a shell, or even a shell script, at boot time is fine for simple cases, but really you need something more flexible. Normally, Unix systems run a program called init that starts up and monitors other programs. Over the years, there have been many init programs, some of which I will describe in Chapter 9, Interfacing with Device Drivers. For now, I will briefly introduce the init from BusyBox.
The init program begins by reading the configuration file, /etc/inittab. Here is a simple example, which is adequate for our needs:
::sysinit:/etc/init.d/rcS
::askfirst:-/bin/ash
The first line runs a shell script, rcS, when init is started. The second line prints the message Please press Enter to activate this console to the console and starts a shell when you press Enter. The leading - before /bin/ash means that it will become a login shell, which sources /etc/profile and $HOME/.profile before giving the shell prompt. One of the advantages of launching the shell like this is that job control is enabled. The most immediate effect is that you can use Ctrl + C to terminate the current program. Maybe you didn't notice it before, but wait until you run the ping program and find you can't stop it!
BusyBox init provides a default inittab if none is present in the root filesystem. It is a little more extensive than the preceding one.
The script called /etc/init.d/rcS is the place to put initialization commands that need to be performed at boot, for example, mounting the proc and sysfs filesystems:
#!/bin/sh
mount -t proc proc /proc
mount -t sysfs sysfs /sys
Make sure that you make rcS executable like this:
$ cd ~/rootfs
$ chmod +x etc/init.d/rcS
You can try it out on QEMU by changing the -append parameter like this:
-append "console=ttyAMA0 rdinit=/sbin/init"
For the BeagleBone Black, you need to set the bootargs variable in U-Boot as shown here:
setenv bootargs console=ttyO0,115200 rdinit=/sbin/init
Starting a daemon process
Typically, you would want to run certain background processes at startup. Let's take the log daemon, syslogd, as an example. The purpose of syslogd is to accumulate log messages from other programs, mostly other daemons. Naturally, BusyBox has an applet for that!
Starting the daemon is as simple as adding a line like this to etc/inittab:
::respawn:/sbin/syslogd -n
respawn means that if the program terminates, it will be automatically restarted; -n means that it should run as a foreground process. The log is written to /var/log/messages.
You may also want to start klogd in the same way: klogd sends kernel log messages to syslogd so that they can be logged to permanent storage.
Configuring user accounts
As I have hinted already, it is not good practice to run all programs as root, since if one is compromised by an outside attack, then the whole system is at risk. It is preferable to create unprivileged user accounts and use them where full root is not necessary.
Usernames are configured in /etc/passwd. There is one line per user, with seven fields of information separated by colons, which are in order:
The login name
A hash code used to verify the password, or more usually an x to indicate that the password is stored in /etc/shadow
The user ID
The group ID
A comment field, often left blank
The user's home directory
(Optional) the shell this user will use
Here is a simple example in which we have user root with UID 0, and user daemon with UID 1:
root:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/false
Setting the shell for user daemon to /bin/false ensures that any attempt to log on with that name will fail.
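To make the seven fields concrete, the short sketch below splits the daemon entry shown above at its colons. The variable names are my own labels for the fields and have no special meaning to the system.

```shell
#!/bin/sh
# Sketch: split one /etc/passwd entry into its seven colon-separated fields.
line='daemon:x:1:1:daemon:/usr/sbin:/bin/false'

# IFS=: applies only to this read, so the rest of the script is unaffected
IFS=: read -r login hash uid gid comment home shell <<EOF
$line
EOF

echo "user $login has UID $uid, GID $gid, home $home, and shell $shell"
```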
Various programs have to read /etc/passwd in order to look up usernames and UIDs, and so the file has to be world readable. This is a problem if the password hashes are stored in there as well, because a malicious program would be able to take a copy and discover the actual passwords using a variety of cracker programs. Therefore, to reduce the exposure of this sensitive information, the passwords are stored in /etc/shadow and x is placed in the password field to indicate that this is the case. The file called /etc/shadow only needs to be accessed by root, so as long as the root user is not compromised, the passwords are safe.
The shadow password file consists of one entry per user, made up of nine fields. Here is an example that mirrors the password file shown in the preceding paragraph:
root::10933:0:99999:7:::
daemon:*:10933:0:99999:7:::
The first two fields are the username and the password hash. The remaining seven fields are related to password aging, which is not usually an issue on embedded devices. If you are curious about the full details, refer to the manual page for shadow(5).
In the example, the password for root is empty, meaning that root can log on without giving a password. Having an empty password for root is useful during development but not for production. You can generate or change a password hash by running the passwd command on the target, which will write a new hash to /etc/shadow. If you want all subsequent root filesystems to have this same password, you could copy this file back to the staging directory.
Group names are stored in a similar way in /etc/group. There is one line per group consisting of four fields separated by colons. The fields are as follows:
The name of the group
The group password, usually an x character, indicating that there is no group password
The GID or group ID
An optional list of users who belong to this group, separated by commas
Here is an example:
root:x:0:
daemon:x:1:
Adding user accounts to the root filesystem
Firstly, you have to add to your staging directory the files etc/passwd, etc/shadow, and etc/group, as shown in the preceding section. Make sure that the permissions of shadow are 0600. Next, you need to initiate the login procedure by starting a program called getty. There is a version of getty in BusyBox. You launch it from inittab using the keyword respawn, which restarts getty when a login shell is terminated, so inittab should read like this:
::sysinit:/etc/init.d/rcS
::respawn:/sbin/getty 115200 console
Then, rebuild the ramdisk and try it out using QEMU or the BeagleBone Black as before.
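Creating the three account files can be scripted. The following sketch writes the example entries from the preceding section and restricts the permissions on shadow. It works in a temporary directory, which is only a stand-in for your real staging directory such as ~/rootfs.

```shell
#!/bin/sh
# Sketch: populate etc/passwd, etc/shadow, and etc/group in a staging area.
# The temporary directory is a stand-in for the real staging directory.
staging=$(mktemp -d)
mkdir -p "$staging/etc"

printf '%s\n' 'root:x:0:0:root:/root:/bin/sh' \
    'daemon:x:1:1:daemon:/usr/sbin:/bin/false' > "$staging/etc/passwd"
printf '%s\n' 'root::10933:0:99999:7:::' \
    'daemon:*:10933:0:99999:7:::' > "$staging/etc/shadow"
printf '%s\n' 'root:x:0:' 'daemon:x:1:' > "$staging/etc/group"

# Only the owner may read or write the password hashes
chmod 0600 "$staging/etc/shadow"

# The first ten characters of ls -l are the file type and permission bits
shadow_mode=$(ls -l "$staging/etc/shadow" | cut -c1-10)
echo "shadow permissions: $shadow_mode"

rm -rf "$staging"
```

Note that the files will be owned by whoever runs the script; the --owner root:root option to cpio shown earlier takes care of ownership in the archive.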
A better way of managing device nodes
Creating device nodes statically with mknod is quite hard work and inflexible. There are other ways to create device nodes automatically on demand:
devtmpfs: This is a pseudo filesystem that you mount over /dev at boot time. The kernel populates it with device nodes for all the devices that the kernel currently knows about, and it creates nodes for new devices as they are detected at runtime. The nodes are owned by root and have default permissions of 0600. Some well-known device nodes, such as /dev/null and /dev/random, override the default to 0666. To see exactly how this is done, take a look at the Linux source file drivers/char/mem.c and see how struct memdev is initialized.
mdev: This is a BusyBox applet that is used to populate a directory with device nodes and to create new nodes as needed. There is a configuration file, /etc/mdev.conf, which contains rules for ownership and the mode of the nodes.
udev: This is the mainstream equivalent of mdev. You will find it on desktop Linux and in some embedded devices. It is very flexible and a good choice for higher end embedded devices. It is now part of systemd.
Although both mdev and udev create the device nodes themselves, it is more usual to let devtmpfs do the job and use mdev/udev as a layer on top to implement the policy for setting ownership and permissions.
An example using devtmpfs
Support for the devtmpfs filesystem is controlled by the kernel configuration variable CONFIG_DEVTMPFS. It is not enabled in the default configuration of the ARM Versatile PB, so if you want to try out the following using this target, you will have to go back and enable this option. Trying out devtmpfs is as simple as entering this command:
# mount -t devtmpfs devtmpfs /dev
You will notice that afterward, there are many more device nodes in /dev. For a permanent fix, add this to /etc/init.d/rcS:
#!/bin/sh
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs devtmpfs /dev
If you enable CONFIG_DEVTMPFS_MOUNT in your kernel configuration, the kernel will automatically mount devtmpfs just after mounting the root filesystem. However, this option has no effect when booting initramfs, as we are doing here.
An example using mdev
While mdev is a bit more complex to set up, it does allow you to modify the permissions of device nodes as they are created. You begin by running mdev with the -s option, which causes it to scan the /sys directory looking for information about current devices. From this information, it populates the /dev directory with the corresponding nodes. If you want to keep track of new devices coming online and create nodes for them as well, you need to make mdev a hot plug client by writing to /proc/sys/kernel/hotplug. These additions to /etc/init.d/rcS will achieve all of this:
#!/bin/sh
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs devtmpfs /dev
echo /sbin/mdev > /proc/sys/kernel/hotplug
mdev -s
The default mode is 660 and the ownership is root:root. You can change this by adding rules in /etc/mdev.conf. For example, to give the null, random, and urandom devices their correct modes, you would add this to /etc/mdev.conf:
null root:root 666
random root:root 444
urandom root:root 444
The format is documented in the BusyBox source code in docs/mdev.txt, and there are more examples in the directory named examples.
Are static device nodes so bad after all?
Statically created device nodes do have one advantage over running a device manager: they don't take any time during boot to create. If minimizing boot time is a priority, using statically-created device nodes will save a measurable amount of time.
Configuring the network
Next, let's look at some basic network configurations so that we can communicate with the outside world. I am assuming that there is an Ethernet interface, eth0, and that we only need a simple IPv4 configuration.
These examples use the network utilities that are part of BusyBox, and they are sufficient for a simple use case, using the old-but-reliable ifup and ifdown programs. You can read the manual pages for both to get the details. The main network configuration is stored in /etc/network/interfaces. You will need to create these directories in the staging directory:
etc/network
etc/network/if-pre-up.d
etc/network/if-up.d
var/run
For a static IP address, /etc/network/interfaces would look like this:
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
    address 192.168.1.101
    netmask 255.255.255.0
    network 192.168.1.0
For a dynamic IP address allocated using DHCP, /etc/network/interfaces would look like this:
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet dhcp
You will also have to configure a DHCP client program. BusyBox has one named udhcpc. It needs a shell script that should go in /usr/share/udhcpc/default.script. There is a suitable default in the BusyBox source code in the directory examples/udhcp/simple.script.
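The directory and file setup for the DHCP case can be collected into one script. This is a minimal sketch under the assumption that a temporary directory stands in for the staging directory. On a real build, you would also copy the udhcpc script from the BusyBox source tree into usr/share/udhcpc/default.script, which is not done here.

```shell
#!/bin/sh
# Sketch: create the network directories and a DHCP interfaces file.
# The temporary directory is a stand-in for the real staging directory.
staging=$(mktemp -d)

for dir in etc/network etc/network/if-pre-up.d etc/network/if-up.d var/run; do
    mkdir -p "$staging/$dir"
done

cat > "$staging/etc/network/interfaces" <<'EOF'
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet dhcp
EOF

# Count the configured interfaces as a simple sanity check
iface_count=$(grep -c '^iface' "$staging/etc/network/interfaces")
if [ -d "$staging/etc/network/if-up.d" ]; then dirs_ok=yes; else dirs_ok=no; fi
echo "interfaces configured: $iface_count, directories created: $dirs_ok"

rm -rf "$staging"
```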
Network components for glibc
glibc uses a mechanism known as the name service switch (NSS) to control the way that names are resolved to numbers for networking and users. Usernames, for example, may be resolved to UIDs via the file /etc/passwd, and network services such as HTTP can be resolved to the service port number via /etc/services. All this is configured by /etc/nsswitch.conf; see the manual page, nss(5), for full details. Here is a simple example that will suffice for most embedded Linux implementations:
passwd:    files
group:     files
shadow:    files
hosts:     files dns
networks:  files
protocols: files
services:  files
Everything is resolved by the correspondingly named file in /etc, except for the host names, which may additionally be resolved by a DNS lookup.
To make this work, you need to populate /etc with those files. Networks, protocols, and services are the same across all Linux systems, so they can be copied from /etc in your development PC. /etc/hosts should, at least, contain the loopback address:
127.0.0.1 localhost
The other files, passwd, group, and shadow, have been described earlier in the section Configuring user accounts.
The last piece of the jigsaw is the libraries that perform the name resolution. They are plugins that are loaded as needed based on the contents of nsswitch.conf, meaning that they do not show up as dependencies if you use readelf or ldd. You will simply have to copy them from the toolchain's sysroot:
$ cd ~/rootfs
$ cp -a $SYSROOT/lib/libnss* lib
$ cp -a $SYSROOT/lib/libresolv* lib
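Because a missing NSS file tends to show up only at runtime, it can be handy to check the staging directory before building an image. The following is a minimal sketch; check_nss_files is a hypothetical helper name, and it only looks for the configuration files named in this section, not for the libnss and libresolv libraries.

```shell
#!/bin/sh
# Sketch only: report which name service files are still missing from a
# staging directory. check_nss_files is a hypothetical helper name.
# usage: check_nss_files <staging-dir>
check_nss_files() {
    staging=$1
    missing=0
    for f in nsswitch.conf passwd group shadow hosts networks protocols services; do
        if [ ! -f "$staging/etc/$f" ]; then
            echo "missing: /etc/$f"
            missing=1
        fi
    done
    # Non-zero exit status if at least one file was not found
    return $missing
}
```

Running check_nss_files ~/rootfs prints one line per missing file and returns a non-zero status if any are absent.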
Creating filesystem images with device tables
We saw earlier in the section Creating a boot initramfs that the kernel has an option to create initramfs using a device table. Device tables are really useful because they allow a non-root user to create device nodes and to allocate arbitrary UID and GID values to any file or directory. The same concept has been applied to tools that create other filesystem image formats, as shown in this table:
Filesystem format    Tool
jffs2                mkfs.jffs2
ubifs                mkfs.ubifs
ext2                 genext2fs
We will look at jffs2 and ubifs in Chapter 7, Creating a Storage Strategy, when we look at filesystems for flash memory. The third, ext2, is a format commonly used for managed flash memory, including SD cards. The example that follows uses ext2 to create a disk image that can be copied to an SD card.
They each take a device table file with the format <name> <type> <mode> <uid> <gid> <major> <minor> <start> <inc> <count>, where the meanings of the fields are as follows:
name: The path of the file, directory, or device node
type: One of the following:
    f: A regular file
    d: A directory
    c: A character special device file
    b: A block special device file
    p: A FIFO (named pipe)
mode: The access permissions, in octal
uid: The UID of the file
gid: The GID of the file
major and minor: The device numbers (device nodes only)
start, inc, and count: Allow you to create a group of device nodes starting from the minor number in start (device nodes only)
You do not have to specify every file, as you do with the kernel initramfs table. You just have to point at a directory (the staging directory) and list the changes and exceptions you need to make in the final filesystem image.
A simple example which populates static device nodes for us is as follows:
/dev         d 755 0 0 -   - - - -
/dev/null    c 666 0 0 1   3 0 0 -
/dev/console c 600 0 0 5   1 0 0 -
/dev/ttyO0   c 600 0 0 252 0 0 0 -
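A device table with a missing column is easy to write and tedious to debug, so a quick format check before building the image can save time. The sketch below checks only what the format description states: ten fields per line and a type letter of f, d, c, b, or p. The name check_device_table is hypothetical, and treating lines that begin with # as comments is my assumption.

```shell
#!/bin/sh
# Sketch only: sanity-check a device table before passing it to a tool
# such as genext2fs. check_device_table is a hypothetical helper name.
# It checks the field count and the type letter, nothing else.
# usage: check_device_table <device-table-file>
check_device_table() {
    awk '
        /^[ \t]*(#|$)/ { next }    # assumption: skip comments and blank lines
        NF != 10 {
            printf "line %d: expected 10 fields, found %d\n", NR, NF
            bad = 1
            next
        }
        $2 !~ /^[fdcbp]$/ {
            printf "line %d: unknown type %s\n", NR, $2
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

For example, check_device_table device-table.txt prints nothing and returns zero for the four-line table shown above.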
Then, you can use genext2fs to generate a filesystem image of 4 MiB (that is 4,096 blocks of the default size, 1,024 bytes):
$ genext2fs -b 4096 -d rootfs -D device-table.txt -U rootfs.ext2
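The block count passed with -b follows directly from the image size: the size in bytes divided by the 1,024-byte block size. As a small worked example (blocks_for_mib is a hypothetical helper name):

```shell
#!/bin/sh
# Sketch only: number of 1,024-byte blocks needed for an image of a given
# size in MiB. blocks_for_mib is a hypothetical helper name.
# usage: blocks_for_mib <size-in-MiB>
blocks_for_mib() {
    echo $(( $1 * 1024 * 1024 / 1024 ))
}

blocks_for_mib 4     # prints 4096
```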
Now, you can copy the resulting image, rootfs.ext2, to an SD card or similar, which we will do next.
Booting the BeagleBone Black
The script called MELP/format-sdcard.sh creates two partitions on the microSD card: one for the boot files and one for the root filesystem. Assuming that you have created the root filesystem image as shown in the previous section, you can use the dd command to write it to the second partition. As always, when copying files directly to storage devices like this, make absolutely sure that you know which is the microSD card. In this case, I am using a built-in card reader, which is the device called /dev/mmcblk0, so the command is as follows:
$ sudo dd if=rootfs.ext2 of=/dev/mmcblk0p2
Then, slot the microSD card into the BeagleBone Black, and set the kernel command line to root=/dev/mmcblk0p2. The complete sequence of U-Boot commands is as follows:
fatload mmc 0:1 0x80200000 zImage
fatload mmc 0:1 0x80f00000 am335x-boneblack.dtb
setenv bootargs console=ttyO0,115200 root=/dev/mmcblk0p2
bootz 0x80200000 - 0x80f00000
This is an example of mounting a filesystem from a normal block device, such as an SD card. The same principles apply to other filesystem types and we will look at them in more detail in Chapter 7, Creating a Storage Strategy.
Mounting the root filesystem using NFS
If your device has a network interface, it is often useful to mount the root filesystem over the network during development. It gives you access to the almost unlimited storage on your host machine, so you can add in debug tools and executables with large symbol tables. As an added bonus, updates made to the root filesystem on the development machine are made available on the target immediately. You can also access all the target's log files from the host.
To begin with, you need to install and configure an NFS server on your host. On Ubuntu, the package to install is named nfs-kernel-server:
$ sudo apt-get install nfs-kernel-server
The NFS server needs to be told which directories are being exported to the network, which is controlled by /etc/exports. There is one line for each export. The format is described in the manual page exports(5). As an example, to export the root filesystem on my host, I have this:
/home/chris/rootfs *(rw,sync,no_subtree_check,no_root_squash)
* exports the directory to any address on my local network. If you wish, you can give a single IP address or a range at this point. There follows a list of options enclosed in parentheses. There must not be any spaces between * and the opening parenthesis. The options are here:
rw: This exports the directory as read-write.
sync: This option selects the synchronous version of the NFS protocol, which is more robust but a little slower than the async option.
no_subtree_check: This option disables subtree checking, which has mild security implications, but can improve reliability in some circumstances.
no_root_squash: This option allows requests from user ID 0 to be processed without squashing to a different user ID. It is necessary to allow the target to access correctly the files owned by root.
Having made changes to /etc/exports, restart the NFS server to pick them up.
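Since the export line is sensitive to spacing, it may help to generate it rather than type it. This is a sketch under the assumption that you want exactly the four options discussed above; exports_line is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: format one line of /etc/exports with the options discussed
# in this section. exports_line is a hypothetical helper name.
# usage: exports_line <directory> <client>
exports_line() {
    # No space between the client and the opening parenthesis
    printf '%s %s(rw,sync,no_subtree_check,no_root_squash)\n' "$1" "$2"
}

exports_line /home/chris/rootfs '*'
exports_line /home/chris/rootfs 192.168.1.101
```

The output can be appended to /etc/exports as root. On hosts that use the standard nfs-utils tools, sudo exportfs -ra should make the server re-read the file as an alternative to a full restart.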
Now, you need to set up the target to mount the root filesystem over NFS. For this to work, your kernel has to be configured with CONFIG_ROOT_NFS. Then, you can configure Linux to do the mount at boot time by adding the following to the kernel command line:
root=/dev/nfs rw nfsroot=<host-ip>:<root-dir> ip=<target-ip>
The options are as follows:
rw: This mounts the root filesystem read-write.
nfsroot: This specifies the IP address of the host, followed by the path to the exported root filesystem.
ip: This is the IP address to be assigned to the target. Usually, network addresses are assigned at runtime, as we have seen in the section Configuring the network. However, in this case, the interface has to be configured before the root filesystem is mounted and init has been started. Hence it is configured on the kernel command line.
There is more information about NFS root mounts in the kernel source in Documentation/filesystems/nfs/nfsroot.txt.
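To see how the placeholders in that command line are filled in, here is a small sketch that assembles the string from its parts. The helper name nfsroot_bootargs and its argument order are my own; the values in the example call are the ones used elsewhere in this chapter.

```shell
#!/bin/sh
# Sketch only: assemble the kernel command line for an NFS root mount.
# nfsroot_bootargs is a hypothetical helper name.
# usage: nfsroot_bootargs <console> <host-ip> <root-dir> <target-ip>
nfsroot_bootargs() {
    printf 'console=%s root=/dev/nfs rw nfsroot=%s:%s ip=%s\n' "$1" "$2" "$3" "$4"
}

nfsroot_bootargs ttyAMA0,115200 192.168.1.1 /home/chris/rootfs 192.168.1.101
```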
Testing with QEMU
The following script creates a virtual network between the network device called tap0 on the host and eth0 on the target using a pair of static IPv4 addresses, and then launches QEMU with the parameters to use tap0 as the emulated interface.
You will need to change the path to the root filesystem to be the full path to your staging directory and maybe the IP addresses if they conflict with your network configuration:
#!/bin/bash
KERNEL=zImage
DTB=versatile-pb.dtb
ROOTDIR=/home/chris/rootfs
HOST_IP=192.168.1.1
TARGET_IP=192.168.1.101
NET_NUMBER=192.168.1.0
NET_MASK=255.255.255.0
# Create the tap0 device on the host and give it the host address
sudo tunctl -u $(whoami) -t tap0
sudo ifconfig tap0 ${HOST_IP}
sudo route add -net ${NET_NUMBER} netmask ${NET_MASK} dev tap0
sudo sh -c "echo 1 > /proc/sys/net/ipv4/ip_forward"
# QEMU_AUDIO_DRV must be on the same command line as qemu-system-arm
# (hence the trailing backslash) so that it is passed in its environment
QEMU_AUDIO_DRV=none \
qemu-system-arm -m 256M -nographic -M versatilepb -kernel ${KERNEL} \
    -append "console=ttyAMA0,115200 root=/dev/nfs rw nfsroot=${HOST_IP}:${ROOTDIR} ip=${TARGET_IP}" \
    -dtb ${DTB} -net nic -net tap,ifname=tap0,script=no
The script is available in MELP/chapter_05/run-qemu-nfsroot.sh.
It should boot up as before, now using the staging directory directly via the NFS export. Any files that you create in that directory will be immediately visible to the target device, and any files created in the device will be visible to the development PC.
Testing with the BeagleBone Black
In a similar way, you can enter these commands at the U-Boot prompt of the BeagleBone Black:
setenv serverip 192.168.1.1
setenv ipaddr 192.168.1.101
setenv npath [path to staging directory]
setenv bootargs console=ttyO0,115200 root=/dev/nfs rw nfsroot=${serverip}:${npath} ip=${ipaddr}
fatload mmc 0:1 0x80200000 zImage
fatload mmc 0:1 0x80f00000 am335x-boneblack.dtb
bootz 0x80200000 - 0x80f00000
There is a U-Boot environment file in chapter_05/uEnv.txt, which contains all these commands. Just copy it to the boot partition of the microSD card and U-Boot will do the rest.
Problems with file permissions
The files that you copied into the staging directory will be owned by the UID of the user you are logged on as, typically 1000. However, the target has no knowledge of this user. What is more, any files created by the target will be owned by users configured by the target, often the root user. The whole thing is a mess. Unfortunately, there is no simple way out. The best solution is to make a copy of the staging directory and change the ownership of every file in the copy to UID 0 and GID 0, using the command sudo chown -R 0:0 *. Then, export this directory as the NFS mount. It removes the convenience of having just one copy of the root filesystem shared between development and target systems, but, at least, the file ownership will be correct.
Using TFTP to load the kernel
Now that we know how to mount the root filesystem over a network using NFS, you may be wondering if there is a way to load the kernel, device tree, and initramfs over the network as well. If we could do this, the only component that needs to be written to storage on the target is the bootloader. Everything else could be loaded from the host machine. It would save time since you would not need to keep reflashing the target, and you could even get work done while the flash storage drivers are still being developed (it happens).
The Trivial File Transfer Protocol (TFTP) is the answer to the problem. TFTP is a very simple file transfer protocol, designed to be easy to implement in bootloaders such as U-Boot.
First, though, you need to install a TFTP daemon on your development machine. On Ubuntu, you could install the tftpd-hpa package, which, by default, grants read-only access to files in the directory /var/lib/tftpboot. With tftpd-hpa installed and running, copy the files that you want to load on the target into /var/lib/tftpboot, which, for the BeagleBone Black, would be zImage and am335x-boneblack.dtb. Then enter these commands at the U-Boot command prompt:
setenv serverip 192.168.1.1
setenv ipaddr 192.168.1.101
tftpboot 0x80200000 zImage
tftpboot 0x80f00000 am335x-boneblack.dtb
setenv npath [path to staging]
setenv bootargs console=ttyO0,115200 root=/dev/nfs rw nfsroot=${serverip}:${npath} ip=${ipaddr}
bootz 0x80200000 - 0x80f00000
You may find that the tftpboot command hangs, endlessly printing out the letter T, which means that the TFTP requests are timing out. There are a number of reasons why this happens, the most common ones being:
There is an incorrect IP address for serverip.
The TFTP daemon is not running on the server.
There is a firewall on the server which is blocking the TFTP protocol. Most firewalls do indeed block the TFTP port, 69, by default.
Once you have resolved the problem, U-Boot can load the files from the host machine and boot in the usual way. You can automate the process by putting the commands into a uEnv.txt file.
Additional reading
Filesystem Hierarchy Standard, Version 3.0, http://refspecs.linuxfoundation.org/fhs.shtml
ramfs, rootfs and initramfs, Rob Landley, October 17, 2005, which is part of the Linux source in Documentation/filesystems/ramfs-rootfs-initramfs.txt.
Summary
One of the strengths of Linux is that it can support a wide range of root filesystems, and so it can be tailored to suit a wide range of needs. We have seen that it is possible to construct a simple root filesystem manually with a small number of components and that BusyBox is especially useful in this regard. Going through the process one step at a time has given us insight into some of the basic workings of Linux systems, including network configuration and user accounts. However, the task rapidly becomes unmanageable as devices get more complex. And there is the ever-present worry that there may be a security hole in the implementation that we have not noticed.
In the next chapter, I will show you how using an embedded build system can make the process of creating an embedded Linux system much easier and more reliable. I will start by looking at Buildroot, and then go on to look at the more complex, but powerful, Yocto Project.
Selecting a Build System
In the preceding chapters, we covered the four elements of embedded Linux and showed you step-by-step how to build a toolchain, a bootloader, a kernel, a root filesystem, and then combined them into a basic embedded Linux system. And there are a lot of steps! Now, it is time to look at ways to simplify the process by automating it as much as possible. I will look at how embedded build systems can help and look at two of them in particular: Buildroot and the Yocto Project. Both are complex and flexible tools, which would require an entire book to describe fully how they work. In this chapter, I only want to show you the general ideas behind build systems. I will show you how to build a simple device image to get an overall feel of the system, and then how to make some useful changes using the Nova board example from the previous chapters.
In this chapter, we will cover the following topics:
Build systems
Package formats and package managers
Buildroot
The Yocto Project
Build systems
I have referred to the process of creating a system manually, as we did in Chapter 5, Building a Root Filesystem, as the Roll Your Own (RYO) process. It has the advantage that you are in complete control of the software, and you can tailor it to do anything you like. If you want it to do something truly odd but innovative, or if you want to reduce the memory footprint to the smallest size possible, RYO is the way to go. But, in the vast majority of situations, building manually is a waste of time and produces inferior, unmaintainable systems.
The idea of a build system is to automate all the steps I have described up to this point. A build system should be able to build, from upstream source code, some or all of the following:
A toolchain
A bootloader
A kernel
A root filesystem
Building from upstream source code is important for a number of reasons. It means that you have peace of mind that you can rebuild at any time, without external dependencies. It also means that you have the source code for debugging and also that you can meet your license requirements to distribute the code to users where necessary.
Therefore, to do its job, a build system has to be able to do the following:
1. Download the source code from upstream, either directly from the source code control system or as an archive, and cache it locally.
2. Apply patches to enable cross compilation, fix architecture-dependent bugs, apply local configuration policies, and so on.
3. Build the various components.
4. Create a staging area and assemble a root filesystem.
5. Create image files in various formats ready to be loaded onto the target.
Other things that are useful are as follows:
1. Add your own packages containing, for example, applications or kernel changes.
2. Select various root filesystem profiles: large or small, with and without graphics or other features.
3. Create a standalone SDK that you can distribute to other developers so that they don't have to install the complete build system.
4. Track which open source licenses are used by the various packages you have selected.
5. Have a user-friendly user interface.
In all cases, build systems encapsulate the components of a system into packages, some for the host and some for the target. Each package is defined by a set of rules to get the source, build it, and install the results in the correct location. There are dependencies between the packages and a build mechanism to resolve the dependencies and build the set of packages required.
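The idea of resolving dependencies into a build order can be illustrated with the standard tsort utility, which performs a topological sort. This is a toy sketch only: the package names and their dependencies are invented for illustration, and real build systems use their own mechanisms (GNU make in Buildroot, BitBake in the Yocto Project) rather than tsort.

```shell
#!/bin/sh
# Sketch only: derive a build order from package dependencies with tsort.
# The package names and dependencies are made up for illustration.
# Each input line reads "<dependency> <package>": the first must be built
# before the second.
build_order() {
    tsort <<'EOF'
toolchain kernel
toolchain busybox
toolchain helloworld
busybox rootfs
helloworld rootfs
kernel images
rootfs images
EOF
}

build_order
```

Whatever order tsort chooses for the middle of the list, toolchain always comes first and images always comes last, because everything depends on the former and the latter depends on everything.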
Open source build systems have matured considerably over the last few years. There are many around, including the following:
Buildroot: This is an easy-to-use system using GNU make and Kconfig (https://buildroot.org/)
EmbToolkit: This is a simple system for generating root filesystems; the only one so far that supports LLVM/Clang out of the box (https://www.embtoolkit.org)
OpenEmbedded: This is a powerful system, which is also a core component of the Yocto Project and others (http://openembedded.org)
OpenWrt: This is a build tool oriented towards building firmware for wireless routers (https://openwrt.org)
PTXdist: This is an open source build system sponsored by Pengutronix (http://www.pengutronix.de/software/ptxdist/index_en.html)
The Yocto Project: This extends the OpenEmbedded core with metadata, tools and documentation: probably the most popular system (http://www.yoctoproject.org)
I will concentrate on two of these: Buildroot and the Yocto Project. They approach the problem in different ways and with different objectives.
Buildroot has the primary aim of building root filesystem images, hence the name, although it can build bootloader and kernel images as well. It is easy to install and configure and generates target images quickly.
The Yocto Project, on the other hand, is more general in the way it defines the target system, and so it can build fairly complex embedded devices. Every component is generated as a binary package, by default, using the RPM format, and then the packages are combined together to make the filesystem image. Furthermore, you can install a package manager in the filesystem image, which allows you to update packages at runtime. In other words, when you build with the Yocto Project, you are, in effect, creating your own custom Linux distribution.
Package formats and package managers
Mainstream Linux distributions are, in most cases, constructed from collections of binary (precompiled) packages in either RPM or DEB format. RPM stands for the Red Hat package manager and is used in Red Hat, Suse, Fedora, and other distributions based on them. Debian and Debian-derived distributions, including Ubuntu and Mint, use the Debian package manager format, DEB. In addition, there is a light-weight format specific to embedded devices known as the Itsy package format or IPK, which is based on DEB.
The ability to include a package manager on the device is one of the big differentiators between build systems. Once you have a package manager on the target device, you have an easy path to deploy new packages to it and to update the existing ones. I will talk about the implications of this in Chapter 8, Updating Software in the Field.
Buildroot
The Buildroot project website is at http://buildroot.org.
The current versions of Buildroot are capable of building a toolchain, a bootloader, a kernel, and a root filesystem. It uses GNU make as the principal build tool. There is good online documentation at http://buildroot.org/docs.html, including The Buildroot user manual at https://buildroot.org/downloads/manual/manual.html.
Background
Buildroot was one of the first build systems. It began as part of the uClinux and uClibc projects as a way of generating a small root filesystem for testing. It became a separate project in late 2001 and continued to evolve through to 2006, after which it went into a rather dormant phase. However, since 2009, when Peter Korsgaard took over stewardship, it has been developing rapidly, adding support for glibc-based toolchains and a greatly increased number of packages and target boards.
As a matter of interest, Buildroot is also the ancestor of another popular build system, OpenWrt (http://wiki.openwrt.org), which forked from Buildroot around 2004. The primary focus of OpenWrt is to produce software for wireless routers, and so the package mix is oriented toward the networking infrastructure. It also has a runtime package manager using the IPK format so that a device can be updated or upgraded without a complete reflash of the image. However, Buildroot and OpenWrt have diverged to such an extent that they are now almost completely different build systems. Packages built with one are not compatible with the other.
Stable releases and long-term support
The Buildroot developers produce stable releases four times a year, in February, May, August, and November. They are marked by git tags of the form: <year>.02, <year>.05, <year>.08, and <year>.11. From time to time, a release is marked for Long Term Support (LTS), which means that there will be point releases to fix security and other important bugs for 12 months after the initial release. The 2017.02 release is the first to receive the LTS label.
Installing
As usual, you can install Buildroot either by cloning the repository or downloading an archive. Here is an example of obtaining version 2017.02.1, which was the latest stable version at the time of writing:
$ git clone git://git.buildroot.net/buildroot -b 2017.02.1
$ cd buildroot
The equivalent TAR archive is available at http://buildroot.org/downloads.
Next, you should read the section titled System requirements in The Buildroot user manual, available at http://buildroot.org/downloads/manual/manual.html, and make sure that you have installed all the packages listed there.
Configuring
Buildroot uses the kernel Kconfig/Kbuild mechanism, which I described in the section Understanding kernel configuration in Chapter 4, Configuring and Building the Kernel. You can configure Buildroot from scratch directly using make menuconfig (or xconfig or gconfig), or you can choose one of the 100+ configurations for various development boards and the QEMU emulator, which you can find stored in the directory, configs/. Typing make list-defconfigs lists all the default configurations.
Let's begin by building a default configuration that you can run on the ARM QEMU emulator:
$ cd buildroot
$ make qemu_arm_versatile_defconfig
$ make
You do not tell make how many parallel jobs to run with a -j option: Buildroot will make optimum use of your CPUs all by itself. If you want to limit the number of jobs, you can run make menuconfig and look under the Build options.
The build will take half an hour to an hour or more depending on the capabilities of your host system and the speed of your link to the internet. It will download approximately 220 MiB of code and will consume about 3.5 GiB of disk space. When it is complete, you will find that two new directories have been created:
dl/: This contains archives of the upstream projects that Buildroot has built
output/: This contains all the intermediate and final compiled resources
You will see the following in output/:
build/: Here, you will find the build directory for each component.
host/: This contains various tools required by Buildroot that run on the host, including the executables of the toolchain (in output/host/usr/bin).
images/: This is the most important of all since it contains the results of the build. Depending on what you selected when configuring, you will find a bootloader, a kernel, and one or more root filesystem images.
staging/: This is a symbolic link to the sysroot of the toolchain. The name of the link is a little confusing, because it does not point to a staging area as I defined it in Chapter 5, Building a Root Filesystem.
target/: This is the staging area for the root directory. Note that you cannot use it as a root filesystem as it stands because the file ownership and the permissions are not set correctly. Buildroot uses a device table, as described in the previous chapter, to set ownership and permissions when the filesystem image is created in the images/ directory.
Running
Some of the sample configurations have a corresponding entry in the directory board/, which contains custom configuration files and information about installing the results on the target. In the case of the system you have just built, the relevant file is board/qemu/arm-versatile/readme.txt, which tells you how to start QEMU with this target. Assuming that you have already installed qemu-system-arm as described in Chapter 1, Starting Out, you can run it using this command:
$ qemu-system-arm -M versatilepb -m 256 \
-kernel output/images/zImage \
-dtb output/images/versatile-pb.dtb \
-drive file=output/images/rootfs.ext2,if=scsi,format=raw \
-append "root=/dev/sda console=ttyAMA0,115200" \
-serial stdio -net nic,model=rtl8139 -net user
There is a script named MELP/chapter_06/run-qemu-buildroot.sh in the book code archive, which includes that command. When QEMU boots up, you should see the kernel boot messages appear in the same terminal window where you started QEMU, followed by a login prompt:
Booting Linux on physical CPU 0x0
Linux version 4.9.6 (chris@chris-xps) (gcc version 5.4.0
(Buildroot 2017.02.1)) #1 Tue Apr 18 10:30:03 BST 2017
CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00093177
[...]
VFS: Mounted root (ext2 filesystem) readonly on device 8:0.
devtmpfs: mounted
Freeing unused kernel memory: 132K (c042f000-c0450000)
This architecture does not have kernel memory protection.
EXT4-fs (sda): warning: mounting unchecked fs, running e2fsck is recommended
EXT4-fs (sda): re-mounted. Opts: block_validity,barrier,user_xattr,errors=remount-ro
Starting logging: OK
Initializing random number generator... done.
Starting network: 8139cp 0000:00:0c.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
udhcpc: started, v1.26.2
udhcpc: sending discover
udhcpc: sending select for 10.0.2.15
udhcpc: lease of 10.0.2.15 obtained, lease time 86400
deleting routers
adding dns 10.0.2.3
OK
Welcome to Buildroot
buildroot login:
Log in as root, no password.
You will see that QEMU launches a black window in addition to the one with the kernel boot messages. It is there to display the graphics frame buffer of the target. In this case, the target never writes to the frame buffer, which is why it appears black. To close QEMU, either type Ctrl-Alt-2 to get to the QEMU console and then type quit, or just close the frame buffer window.
Creating a custom BSP
Next, let's use Buildroot to create a BSP for our Nova board using the same versions of U-Boot and Linux from earlier chapters. You can see the changes I made to Buildroot during this section in the book code archive in MELP/chapter_06/buildroot. The recommended places to store your changes are here:
board/<organization>/<device>: This contains any patches, binary blobs, extra build steps, configuration files for Linux, U-Boot, and other components
configs/<device>_defconfig: This contains the default configuration for the board
package/<organization>/<package_name>: This is the place to put any additional packages for this board
Let's begin by creating a directory to store changes for the Nova board:
$ mkdir -p board/melp/nova
Next, clean the artifacts from any previous build, which you should always do when changing configurations:
$ make clean
Now, select the configuration for the BeagleBone, which we are going to use as the basis of the Nova configuration.
$ make beaglebone_defconfig
U-Boot
In Chapter 3, All About Bootloaders, we created a custom bootloader for Nova, based on the 2017.01 version of U-Boot and created a patch file for it, which you will find in MELP/chapter_03/0001-BSP-for-Nova.patch. We can configure Buildroot to select the same version and apply our patch. Begin by copying the patch file into board/melp/nova, and then use make menuconfig to set the U-Boot version to 2017.01, the patch file to board/melp/nova/0001-BSP-for-Nova.patch, and the board name to Nova, as shown in this screenshot:
We also need a U-Boot script to load the Nova device tree and the kernel from the SD card. We can put the file into board/melp/nova/uEnv.txt. It should contain these commands (uenvcmd must be written on a single line):
bootpart=0:1
bootdir=
bootargs=console=ttyO0,115200n8 root=/dev/mmcblk0p2 rw rootfstype=ext4 rootwait
uenvcmd=fatload mmc 0:1 88000000 nova.dtb;fatload mmc 0:1 82000000 zImage;bootz 82000000 - 88000000
Linux
In Chapter 4, Configuring and Building the Kernel, we based the kernel on Linux 4.9.13 and supplied a new device tree, which is in MELP/chapter_04/nova.dts. Copy the device tree to board/melp/nova, change the Buildroot kernel configuration to select Linux version 4.9.13, and the device tree source to board/melp/nova/nova.dts, as shown in the following screenshot:
We will also have to change the kernel series to be used for kernel headers to match the kernel being built:
Build
In the last stage of the build, Buildroot uses a tool named genimage to create an image for the SD card that we can copy directly to the card. We need a configuration file to lay out the image in the right way. We will name the file board/melp/nova/genimage.cfg and populate it as shown here:
image boot.vfat {
    vfat {
        files = {
            "MLO",
            "u-boot.img",
            "zImage",
            "uEnv.txt",
            "nova.dtb",
        }
    }
    size = 16M
}
image sdcard.img {
    hdimage {
    }
    partition u-boot {
        partition-type = 0xC
        bootable = "true"
        image = "boot.vfat"
    }
    partition rootfs {
        partition-type = 0x83
        image = "rootfs.ext4"
        size = 512M
    }
}
This will create a file named sdcard.img, which contains two partitions named u-boot and rootfs. The first contains the boot files listed in boot.vfat, and the second contains the root filesystem image named rootfs.ext4, which will be generated by Buildroot.
Finally, we need to create a post image script that will call genimage, and so create the SD card image. We will put it in board/melp/nova/post-image.sh:
#!/bin/sh
# BINARIES_DIR, BUILD_DIR and TARGET_DIR are set by Buildroot
BOARD_DIR="$(dirname "$0")"
cp "${BOARD_DIR}/uEnv.txt" "${BINARIES_DIR}/uEnv.txt"
GENIMAGE_CFG="${BOARD_DIR}/genimage.cfg"
GENIMAGE_TMP="${BUILD_DIR}/genimage.tmp"
rm -rf "${GENIMAGE_TMP}"
genimage \
    --rootpath "${TARGET_DIR}" \
    --tmppath "${GENIMAGE_TMP}" \
    --inputpath "${BINARIES_DIR}" \
    --outputpath "${BINARIES_DIR}" \
    --config "${GENIMAGE_CFG}"
This copies the uEnv.txt script into the output/images directory and runs genimage with our configuration file.
Now, we can run menuconfig again and change the System configuration option, Custom scripts to run after creating filesystem images, to run our post-image.sh script, as shown in this screenshot:
Finally, you can build Linux for the Nova board just by typing make. When it has finished, you will see these files in the directory, output/images/:
boot.vfat  rootfs.ext2  sdcard.img       uEnv.txt
MLO        rootfs.ext4  u-boot.img       zImage
nova.dtb   rootfs.tar   u-boot-spl.bin
To test it, put a microSD card in the card reader, unmount any partitions that are automounted, and then write sdcard.img to the SD card device itself, not to one of its partitions. There is no need to format it beforehand, as we did in the previous chapter, because genimage has created the exact disk layout required. In the following example, my SD card reader is /dev/mmcblk0:
$ sudo umount /dev/mmcblk0*
$ sudo dd if=output/images/sdcard.img of=/dev/mmcblk0 bs=1M
Put the SD card into the BeagleBone Black and power on while pressing the boot button to force it to load from the SD card. You should see that it boots up with our selected versions of U-Boot, Linux, and with the Nova device tree.
Having shown that our custom configuration for the Nova board works, it would be nice to keep a copy of the configuration so that you and others can use it again, which you can do with this command:
$ make savedefconfig BR2_DEFCONFIG=configs/nova_defconfig
Now, you have a Buildroot configuration for the Nova board. Subsequently, you can retrieve this configuration by typing the following command:
$ make nova_defconfig
Adding your own code
Suppose there is a program that you have developed and that you want to include in the build. You have two options. The first is to build it separately using its own build system, and then roll the binary into the final build as an overlay. The second is to create a Buildroot package that can be selected from the menu and built like any other.
Overlays
An overlay is simply a directory structure that is copied over the top of the Buildroot root filesystem at a late stage in the build process. It can contain executables, libraries, and anything else you may want to include. Note that any compiled code must be compatible with the libraries deployed at runtime, which, in turn, means that it must be compiled with the same toolchain that Buildroot uses. Using the Buildroot toolchain is quite easy. Just add it to PATH:
$ PATH=<path_to_buildroot>/output/host/usr/bin:$PATH
The prefix for the toolchain is <ARCH>-linux-. So, to compile a simple program, you would do something like this:
$ PATH=/home/chris/buildroot/output/host/usr/bin:$PATH
$ arm-linux-gcc helloworld.c -o helloworld
Once you have compiled your program with the correct toolchain, you just need to install the executables and other supporting files into a staging area, and mark it as an overlay for Buildroot. For the helloworld example, you might put it in the board/melp/nova directory:
$ mkdir -p board/melp/nova/overlay/usr/bin
$ cp helloworld board/melp/nova/overlay/usr/bin
Finally, you set BR2_ROOTFS_OVERLAY to the path to the overlay. It can be configured in menuconfig with the option, System configuration | Root filesystem overlay directories.
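If you find yourself adding several files to an overlay, the two commands above can be wrapped up in a small function. This is a sketch only: add_to_overlay is a hypothetical helper name, and the default destination of /usr/bin is simply the one used in the helloworld example.

```shell
#!/bin/sh
# Sketch only: copy a cross-compiled program into a Buildroot overlay
# directory. add_to_overlay is a hypothetical helper name.
# usage: add_to_overlay <overlay-dir> <binary> [<target-dir>]
add_to_overlay() {
    overlay=$1
    binary=$2
    dest=${3:-/usr/bin}
    # The overlay mirrors the layout of the target root filesystem
    mkdir -p "$overlay$dest" || return 1
    cp "$binary" "$overlay$dest/" || return 1
    chmod 0755 "$overlay$dest/$(basename "$binary")"
}
```

For the example in this section, the call would be add_to_overlay board/melp/nova/overlay helloworld, with BR2_ROOTFS_OVERLAY then set to board/melp/nova/overlay.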
Adding a package
Buildroot packages are stored in the package directory, over 2,000 of them, each in its own subdirectory. A package consists of at least two files: Config.in, containing the snippet of Kconfig code required to make the package visible in the configuration menu, and a makefile named <package_name>.mk. Note that the package does not contain the code, just the instructions to get the code by downloading a tarball, doing git pull or whatever is necessary to obtain the upstream source.
The makefile is written in a format expected by Buildroot and contains directives that allow Buildroot to download, configure, compile, and install the program. Writing a new package makefile is a complex operation, which is covered in detail in the Buildroot user manual. Here is an example which shows you how to create a package for a simple program stored locally, such as our helloworld program.
Begin by creating the package/helloworld/ subdirectory with a configuration file, Config.in, which looks like this:
config BR2_PACKAGE_HELLOWORLD
    bool "helloworld"
    help
      A friendly program that prints Hello World! every 10s
The symbol on the first line must be of the format BR2_PACKAGE_<uppercase package name>. This is followed by a Boolean and the package name, as it will appear in the configuration menu, which will allow a user to select this package. The help section is optional (but hopefully useful).
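The naming rule is mechanical, so it can be expressed in a couple of lines. In this sketch, pkg_symbol is a hypothetical helper name; replacing - and . with _ is my understanding of the convention in the Buildroot user manual, so check it there if your package name contains those characters.

```shell
#!/bin/sh
# Sketch only: derive the Kconfig symbol for a Buildroot package name.
# pkg_symbol is a hypothetical helper name.
# usage: pkg_symbol <package-name>
pkg_symbol() {
    # Upper-case the name, then replace '.' and '-' with '_'
    # (the replacement rule is an assumption, see the lead-in)
    printf 'BR2_PACKAGE_%s\n' \
        "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '.-' '__')"
}

pkg_symbol helloworld    # prints BR2_PACKAGE_HELLOWORLD
```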
Next, link the new package into the Target Packages menu by editing package/Config.in and sourcing the configuration file as mentioned in the preceding section. You could append this to an existing submenu but, in this case, it seems neater to create a new submenu, which only contains our package:
menu "My programs"
    source "package/helloworld/Config.in"
endmenu
Then, create a makefile, package/helloworld/helloworld.mk, to supply the data needed by Buildroot:
HELLOWORLD_VERSION = 1.0.0
HELLOWORLD_SITE = /home/chris/MELP/helloworld
HELLOWORLD_SITE_METHOD = local
define HELLOWORLD_BUILD_CMDS
	$(MAKE) CC="$(TARGET_CC)" LD="$(TARGET_LD)" -C $(@D) all
endef
define HELLOWORLD_INSTALL_TARGET_CMDS
	$(INSTALL) -D -m 0755 $(@D)/helloworld $(TARGET_DIR)/usr/bin/helloworld
endef
$(eval $(generic-package))
You can find my helloworld package in the book code archive in MELP/chapter_06/buildroot/package/helloworld and the source code for the program in MELP/chapter_06/helloworld. The location of the code is hardcoded to a local pathname. In a more realistic case, you would get the code from a source code system or from a central server of some kind: there are details of how to do this in the Buildroot user manual and plenty of examples in other packages.
License compliance
Buildroot is open source software, as are the packages it compiles. At some point during the project, you should check the licenses, which you can do by running:
$ make legal-info
The information is gathered into output/legal-info/. There are summaries of the licenses of the host tools in host-manifest.csv and of the packages installed on the target in manifest.csv. There is more information in the README file and in the Buildroot user manual.
The Yocto Project
The Yocto Project is a more complex beast than Buildroot. Not only can it build toolchains, bootloaders, kernels, and root filesystems as Buildroot can, but it can generate an entire Linux distribution for you with binary packages that can be installed at runtime. The Yocto Project is primarily a group of recipes, similar to Buildroot packages but written using a combination of Python and shell script, together with a task scheduler called BitBake that produces whatever you have configured, from the recipes.
There is plenty of online documentation at https://www.yoctoproject.org/.
Background
The structure of the Yocto Project makes more sense if you look at the background first. Its roots are in OpenEmbedded, http://openembedded.org/, which, in turn, grew out of a number of projects to port Linux to various hand-held computers, including the Sharp Zaurus and the Compaq iPaq. OpenEmbedded came to life in 2003 as the build system for those hand-held computers. Soon after, other developers began to use it as a general build system for devices running embedded Linux. It was developed, and continues to be developed, by an enthusiastic community of programmers.
The OpenEmbedded project set out to create a set of binary packages using the compact IPK format, which could then be combined in various ways to create a target system and be installed on the target at runtime. It did this by creating recipes for each package and using BitBake as the task scheduler. It was, and is, very flexible. By supplying the right metadata, you can create an entire Linux distribution to your own specification. One that is fairly well-known is the Ångström Distribution, http://www.angstrom-distribution.org, but there are many others as well.
At some time in 2005, Richard Purdie, then a developer at OpenedHand, created a fork of OpenEmbedded, which had a more conservative choice of packages and created releases that were stable over a period of time. He named it Poky after the Japanese snack (if you are worried about these things, Poky is pronounced to rhyme with hockey). Although Poky was a fork, OpenEmbedded and Poky continued to run alongside each other, sharing updates and keeping the architectures more or less in step. Intel bought out OpenedHand in 2008, and they transferred Poky Linux to the Linux Foundation in 2010 when they formed the Yocto Project.
Since 2010, the common components of OpenEmbedded and Poky have been combined into a separate project known as OpenEmbedded Core or just OE-Core.
Therefore, the Yocto Project collects together several components, the most important of which are the following:
OE-Core: This is the core metadata, which is shared with OpenEmbedded
BitBake: This is the task scheduler, which is shared with OpenEmbedded and other projects
Poky: This is the reference distribution
Documentation: These are the user manuals and developer guides for each component
Toaster: This is a web-based interface to BitBake and its metadata
ADT Eclipse: This is a plugin for Eclipse
The Yocto Project provides a stable base, which can be used as it is or can be extended using meta layers, which I will discuss later in this chapter. Many SoC vendors provide BSPs for their devices in this way. Meta layers can also be used to create extended or just different build systems. Some are open source, such as the Ångström Distribution, and others are commercial, such as MontaVista Carrier Grade Edition, Mentor Embedded Linux, and Wind River Linux. The Yocto Project has a branding and compatibility testing scheme to ensure that there is interoperability between components. You will see statements like Yocto Project compatible on various web pages.
Consequently, you should think of the Yocto Project as the foundation of a whole sector of embedded Linux, as well as being a complete build system in its own right.
You may be wondering about the name, Yocto. The word yocto is the SI prefix for 10^-24, in the same way that micro is 10^-6. Why name the project Yocto? It was partly to indicate that it could build very small Linux systems (although, to be fair, so can other build systems), but also to steal a march on the Ångström Distribution, which is based on OpenEmbedded. An Ångström is 10^-10 meters. That's huge, compared to a yocto!
Stable releases and support
Usually, there is a release of the Yocto Project every six months: in April and October. They are principally known by the code name, but it is useful to know the version numbers of the Yocto Project and Poky as well. Here is a table of the six most recent releases at the time of writing:
Code name    Release date    Yocto version    Poky version
Morty        October 2016    2.2              16
Krogoth      April 2016      2.1              15
Jethro       October 2015    2.0              14
Fido         April 2015      1.8              13
Dizzy        October 2014    1.7              12
Daisy        April 2014      1.6              11
The stable releases are supported with security and critical bug fixes for the current release cycle and the next cycle. In other words, each version is supported for approximately 12 months after the release. As with Buildroot, if you want continued support, you can update to the next stable release, or you can backport changes to your version. You also have the option of commercial support for periods of several years with the Yocto Project from operating system vendors, such as Mentor Graphics, Wind River, and many others.
Installing the Yocto Project
To get a copy of the Yocto Project, you can clone the repository, choosing the code name as the branch, which is morty in this case:
$ git clone -b morty git://git.yoctoproject.org/poky.git
You can also download the archive from http://downloads.yoctoproject.org/releases/yocto/yocto-2.2/poky-morty-16.0.0.tar.bz2. In the first case, you will find everything in the directory poky/; in the second case, in poky-morty-16.0.0/.
In addition, you should read the section titled System Requirements from the Yocto Project Reference Manual (http://www.yoctoproject.org/docs/current/ref-manual/ref-manual.html#detailed-supported-distros); in particular, you should make sure that the packages listed there are installed on your host computer.
Configuring
As with Buildroot, let's begin with a build for the QEMU ARM emulator. Begin by sourcing a script to set up the environment:
$ cd poky
$ source oe-init-build-env
This creates a working directory for you named build/ and makes it the current directory. All of the configuration, intermediate, and target image files will be put in this directory. You must source this script each time you want to work on this project.
You can choose a different working directory by adding it as a parameter to oe-init-build-env, for example:
$ source oe-init-build-env build-qemuarm
This will put you into the directory build-qemuarm/. This way you can have several build directories, each for a different project: you choose which one you want to work with through the parameter to oe-init-build-env.
Initially, the build directory contains only one subdirectory named conf/, which contains the configuration files for this project:
local.conf: This contains a specification of the device you are going to build and the build environment.
bblayers.conf: This contains paths of the meta layers you are going to use. I will describe layers later on.
templateconf.cfg: This contains the name of a directory, which contains various conf files. By default, it points to meta-poky/conf/.
For now, we just need to set the MACHINE variable in local.conf to qemuarm by removing the comment character (#) at the start of this line:
MACHINE ?= "qemuarm"
Building
To actually perform the build, you need to run BitBake, telling it which root filesystem image you want to create. Some common images are as follows:
core-image-minimal: This is a small console-based system which is useful for tests and as the basis for custom images.
core-image-minimal-initramfs: This is similar to core-image-minimal, but built as a ramdisk.
core-image-x11: This is a basic image with support for graphics through an X11 server and the xterminal terminal app.
core-image-sato: This is a full graphical system based on Sato, which is a mobile graphical environment built on X11 and GNOME. The image includes several apps including a terminal, an editor, and a file manager.
By giving BitBake the final target, it will work backwards and build all the dependencies first, beginning with the toolchain. For now, we just want to create a minimal image to see how it works:
$ bitbake core-image-minimal
The build is likely to take some time, probably more than an hour. It will download about 4 GiB of source code, and it will consume about 24 GiB of disk space. When it is complete, you will find several new directories in the build directory including downloads/, which contains all the source downloaded for the build, and tmp/, which contains most of the build artifacts. You should see the following in tmp/:
work/: This contains the build directory and the staging area for the root filesystem.
deploy/: This contains the final binaries to be deployed on the target:
    deploy/images/[machine name]/: This contains the bootloader, the kernel, and the root filesystem images ready to be run on the target
    deploy/rpm/: This contains the RPM packages that went to make up the images
    deploy/licenses/: This contains the license files extracted from each package
Running the QEMU target
When you build a QEMU target, an internal version of QEMU is generated, which removes the need to install the QEMU package for your distribution, and thus avoids version dependencies. There is a wrapper script named runqemu to run this version of QEMU.
To run the QEMU emulation, make sure that you have sourced oe-init-build-env, and then just type this:
$ runqemu qemuarm
In this case, QEMU has been configured with a graphic console so that the boot messages and login prompt appear in the black framebuffer, as shown in the following screenshot:
You can log in as root, without a password. You can close down QEMU by closing the framebuffer window.
You can launch QEMU without the graphic window by adding nographic to the command line:
$ runqemu qemuarm nographic
In this case, you close QEMU using the key sequence Ctrl + A and then x.
The runqemu script has many other options. Type runqemu help for more information.
Layers
The metadata for the Yocto Project is structured into layers. By convention, each layer has a name beginning with meta. The core layers of the Yocto Project are as follows:
meta: This is the OpenEmbedded core with some changes for Poky
meta-poky: This is the metadata specific to the Poky distribution
meta-yocto-bsp: This contains the board support packages for the machines that the Yocto Project supports
The list of layers in which BitBake searches for recipes is stored in <your build directory>/conf/bblayers.conf and, by default, includes all three layers mentioned in the preceding list.
By structuring the recipes and other configuration data in this way, it is very easy to extend the Yocto Project by adding new layers. Additional layers are available from SoC manufacturers, the Yocto Project itself, and a wide range of people wishing to add value to the Yocto Project and OpenEmbedded. There is a useful list of layers at http://layers.openembedded.org/layerindex/branch/master/layers/. Here are some examples:
meta-angstrom: The Ångström distribution
meta-qt5: Qt 5 libraries and utilities
meta-intel: BSPs for Intel CPUs and SoCs
meta-ti: BSPs for TI ARM-based SoCs
Adding a layer is as simple as copying the meta directory into a suitable location, usually alongside the default meta layers, and adding it to bblayers.conf. Make sure that you read the README file that should accompany each layer to see what dependencies it has on other layers and which versions of the Yocto Project it is compatible with.
To illustrate the way that layers work, let's create a layer for our Nova board, which we can use for the remainder of the chapter as we add features. You can see the complete implementation of the layer in the code archive in MELP/chapter_06/poky/meta-nova.
Each meta layer has to have at least one configuration file, named conf/layer.conf, and it should also have a README file and a license. There is a handy helper script that does the basics for us:
$ cd poky
$ scripts/yocto-layer create nova
The script asks for a priority, and whether you want to create sample recipes. In the example here, I just accepted the defaults:
Please enter the layer priority you'd like to use for the layer: [default: 6]
Would you like to have an example recipe created? (y/n) [default: n]
Would you like to have an example bbappend file created? (y/n) [default: n]
New layer created in meta-nova.
Don't forget to add it to your BBLAYERS (for details see meta-nova/README).
This will create a layer named meta-nova with a conf/layer.conf, an outline README, and an MIT license in COPYING.MIT. The layer.conf file looks like this:
# We have a conf and classes directory, add to BBPATH
BBPATH .= ":${LAYERDIR}"
# We have recipes-* directories, add to BBFILES
BBFILES += "${LAYERDIR}/recipes-*/*/*.bb \
            ${LAYERDIR}/recipes-*/*/*.bbappend"
BBFILE_COLLECTIONS += "nova"
BBFILE_PATTERN_nova = "^${LAYERDIR}/"
BBFILE_PRIORITY_nova = "6"
It adds itself to BBPATH and the recipes it contains to BBFILES. From looking at the code, you can see that the recipes are found in the directories with names beginning recipes- and have filenames ending in .bb (for normal BitBake recipes) or .bbappend (for recipes that extend existing recipes by overriding or adding to the instructions). This layer has the name nova added to the list of layers in BBFILE_COLLECTIONS and has a priority of 6. The layer priority is used if the same recipe appears in several layers: the one in the layer with the highest priority wins.
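The priority rule can be pictured with a few lines of Python. This is only a sketch of the idea, not BitBake's code: the function name pick_recipe and the busybox paths are made up for illustration, and the real selection logic also takes recipe versions into account, which is ignored here.

```python
# Simplified model of layer priority. Names and paths are hypothetical;
# real BitBake also weighs recipe versions, which this sketch ignores.

def pick_recipe(providers):
    """providers: list of (layer_name, priority, recipe_path) tuples that
    all supply the same recipe. Returns the entry from the layer with the
    highest priority."""
    return max(providers, key=lambda entry: entry[1])

providers = [
    ("meta", 5, "meta/recipes-core/busybox/busybox_1.0.bb"),
    ("meta-nova", 6, "meta-nova/recipes-core/busybox/busybox_1.0.bb"),
]

winner = pick_recipe(providers)
print(winner[0], winner[2])
```

With the priorities shown, the copy in meta-nova is chosen because 6 is greater than 5.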
Since you are about to build a new configuration, it is best to begin by creating a new build directory named build-nova:
$ cd ~/poky
$ source oe-init-build-env build-nova
Now, you need to add this layer to your build configuration using the command:
$ bitbake-layers add-layer ../meta-nova
You can confirm that it is set up correctly like this:
$ bitbake-layers show-layers
layer             path                              priority
==========================================================
meta              /home/chris/poky/meta             5
meta-poky         /home/chris/poky/meta-poky        5
meta-yocto-bsp    /home/chris/poky/meta-yocto-bsp   5
meta-nova         /home/chris/poky/meta-nova        6
There you can see the new layer. It has a priority of 6, which means that we could override recipes in the other layers, which all have a lower priority.
At this point, it would be a good idea to run a build, using this empty layer. The final target will be the Nova board but, for now, build for a BeagleBone Black by removing the comment before MACHINE ?= "beaglebone" in conf/local.conf. Then, build a small image using bitbake core-image-minimal as before.
As well as recipes, layers may contain BitBake classes, configuration files for machines, distributions, and more. I will look at recipes next and show you how to create a customized image and how to create a package.
BitBake and recipes
BitBake processes metadata of several different types, which include the following:
Recipes: Files ending in .bb. These contain information about building a unit of software, including how to get a copy of the source code, the dependencies on other components, and how to build and install it.
Append: Files ending in .bbappend. These allow some details of a recipe to be overridden or extended. A bbappend file simply appends its instructions to the end of a recipe (.bb) file of the same root name.
Include: Files ending in .inc. These contain information that is common to several recipes, allowing information to be shared among them. The files may be included using the include or require keywords. The difference is that require produces an error if the file does not exist, whereas include does not.
Classes: Files ending in .bbclass. These contain common build information, for example, how to build a kernel or how to build an autotools project. The classes are inherited and extended in recipes and other classes using the inherit keyword. The class classes/base.bbclass is implicitly inherited in every recipe.
Configuration: Files ending in .conf. They define various configuration variables that govern the project's build process.
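The difference between include and require described in the list above can be modeled in a few lines of Python. This is a sketch of the behavior only: load_fragment is a hypothetical helper, not part of BitBake, and the file name is invented.

```python
import os

def load_fragment(path, required):
    """Return the text of a shared .inc-style file.

    required=True mimics `require`: a missing file is an error.
    required=False mimics `include`: a missing file is silently skipped.
    """
    if not os.path.exists(path):
        if required:
            raise FileNotFoundError("require: " + path + " not found")
        return ""
    with open(path) as f:
        return f.read()

# A file that (presumably) does not exist in the current directory
missing = "no-such-file.inc"

print(repr(load_fragment(missing, required=False)))
try:
    load_fragment(missing, required=True)
except FileNotFoundError as err:
    print("build stops:", err)
```

In practice, this is why require is the safer choice when a recipe cannot work without the shared file.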
A recipe is a collection of tasks written in a combination of Python and shell script. The tasks have names such as do_fetch, do_unpack, do_patch, do_configure, do_compile, and do_install. You use BitBake to execute these tasks. The default task is do_build, which performs all the subtasks required to build the recipe. You can list the tasks available in a recipe using bitbake -c listtasks [recipe]. For example, you can list the tasks in core-image-minimal like this:
$ bitbake -c listtasks core-image-minimal
[...]
core-image-minimal-1.0-r0 do_listtasks: do_build
core-image-minimal-1.0-r0 do_listtasks: do_bundle_initramfs
core-image-minimal-1.0-r0 do_listtasks: do_checkuri
core-image-minimal-1.0-r0 do_listtasks: do_checkuriall
core-image-minimal-1.0-r0 do_listtasks: do_clean
[...]
In fact, -c is the option that tells BitBake to run a specific task in a recipe, with the task being named with the do_ part stripped off. The task do_listtasks is simply a special task that lists all the tasks defined within a recipe. Another example is the fetch task, which downloads the source code for a recipe:
$ bitbake -c fetch busybox
You can also use the fetchall task to get the code for the target and all the dependencies, which is useful if you want to make sure you have downloaded all the code for the image you are about to build:
$ bitbake -c fetchall core-image-minimal
The recipe files are usually named <package-name>_<version>.bb. They may have dependencies on other recipes, which allows BitBake to work out all the subtasks that need to be executed to complete the top-level job.
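To make the naming convention concrete, here is a small Python sketch that splits a recipe filename into its package name and version. The helper split_recipe_name is my own illustration, not a BitBake function, and it assumes the package name itself contains no underscore.

```python
def split_recipe_name(filename):
    """Split '<package-name>_<version>.bb' into (name, version).

    The version is None when the filename has no '_<version>' part,
    as is the case for image recipes such as core-image-minimal.bb.
    """
    if not filename.endswith(".bb"):
        raise ValueError("not a recipe file: " + filename)
    stem = filename[:-len(".bb")]
    name, sep, version = stem.partition("_")
    return name, (version if sep else None)

print(split_recipe_name("helloworld_1.0.bb"))
print(split_recipe_name("core-image-minimal.bb"))
```

The first call yields the name helloworld with version 1.0; the second has no version in the filename, so the sketch returns None for it.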
As an example, to create a recipe for our helloworld program in meta-nova, you would create a directory structure like this:
meta-nova/recipes-local/helloworld
├── files
│   └── helloworld.c
└── helloworld_1.0.bb
The recipe is helloworld_1.0.bb and the source is local to the recipe directory in the subdirectory files/. The recipe contains these instructions:
DESCRIPTION = "A friendly program that prints Hello World!"
PRIORITY = "optional"
SECTION = "examples"
LICENSE = "GPLv2"
LIC_FILES_CHKSUM = "file://${COMMON_LICENSE_DIR}/GPL-2.0;md5=801f80980d171dd6425610833a22dbe6"
SRC_URI = "file://helloworld.c"
S = "${WORKDIR}"
do_compile() {
    ${CC} ${CFLAGS} ${LDFLAGS} helloworld.c -o helloworld
}
do_install() {
    install -d ${D}${bindir}
    install -m 0755 helloworld ${D}${bindir}
}
The location of the source code is set by SRC_URI. In this case, the file:// URI means that the code is local to the recipe directory. BitBake will search the directories files/, helloworld/, and helloworld-1.0/ relative to the directory that contains the recipe. The tasks that need to be defined are do_compile and do_install, which compile the one source file and install it into the target root filesystem: ${D} expands to the staging area of the recipe and ${bindir} to the default binary directory, /usr/bin.
Every recipe has a license, defined by LICENSE, which is set to GPLv2 here. The file containing the text of the license and a checksum is defined by LIC_FILES_CHKSUM. BitBake will terminate the build if the checksum does not match, indicating that the license has changed in some way. The license file may be part of the package or it may point to one of the standard license texts in meta/files/common-licenses/, as is the case here.
By default, commercial licenses are disallowed, but it is easy to enable them. You need to specify the license in the recipe, as shown here:
LICENSE_FLAGS = "commercial"
Then, in your conf/local.conf, you would explicitly allow this license, like so:
LICENSE_FLAGS_WHITELIST = "commercial"
Now, to make sure that our helloworld recipe compiles correctly, you can ask BitBake to build it, like so:
$ bitbake helloworld
If all goes well, you should see that it has created a working directory for it in tmp/work/cortexa8hf-vfp-neon-poky-linux-gnueabi/helloworld/. You should also see there is an RPM package for it in tmp/deploy/rpm/cortexa8hf_vfp_neon/helloworld-1.0-r0.cortexa8hf_vfp_neon.rpm.
It is not part of the target image yet, though. The list of packages to be installed is held in a variable named IMAGE_INSTALL. You can append to the end of that list by adding this line to conf/local.conf:
IMAGE_INSTALL_append = " helloworld"
Note that there has to be a space between the opening double quote and the first package name. Now, the package will be added to any image that you bitbake:
$ bitbake core-image-minimal
If you look in tmp/deploy/images/beaglebone/core-image-minimal-beaglebone.tar.bz2, you will see that /usr/bin/helloworld has indeed been installed.
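The reason for the leading space in the _append value is that _append joins strings without inserting a separator, whereas += adds a space for you. The following Python sketch models the two operators; the function names and the starting value of the list are made up for illustration.

```python
def op_append(value, extra):
    """Model of VAR_append: plain concatenation, no separator added."""
    return value + extra

def op_plus_equals(value, extra):
    """Model of VAR +=: a single space is inserted before the new text."""
    return value + " " + extra

# Hypothetical starting value for IMAGE_INSTALL
base = "packagegroup-core-boot"

print(op_append(base, "helloworld"))       # names run together: broken
print(op_append(base, " helloworld"))      # leading space keeps them apart
print(op_plus_equals(base, "helloworld"))  # += supplies the space itself
```

Without the space, the last existing package name and helloworld fuse into a single name that does not correspond to any package.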
Customizing images via local.conf
You may often want to add a package to an image during development or tweak it in other ways. As shown previously, you can simply append to the list of packages to be installed by adding a statement like this:
IMAGE_INSTALL_append = " strace helloworld"
You can make more sweeping changes via EXTRA_IMAGE_FEATURES. Here is a short list which should give you an idea of the features you can enable:
dbg-pkgs: This installs debug symbol packages for all the packages installed in the image.
debug-tweaks: This allows root logins without passwords and other changes that make development easier.
package-management: This installs package management tools and preserves the package manager database.
read-only-rootfs: This makes the root filesystem read-only. We will cover this in more detail in Chapter 7, Creating a Storage Strategy.
x11: This installs the X server.
x11-base: This installs the X server with a minimal environment.
x11-sato: This installs the OpenedHand Sato environment.
There are many more features that you can add in this way. I recommend you look at the Image Features section of the Yocto Project Reference Manual and also read through the code in meta/classes/core-image.bbclass.
Writing an image recipe
The problem with making changes to local.conf is that they are, well, local. If you want to create an image that is to be shared with other developers or to be loaded onto a production system, then you should put the changes into an image recipe.
An image recipe contains instructions about how to create the image files for a target, including the bootloader, the kernel, and the root filesystem images. By convention, image recipes are put into a directory named images, so you can get a list of all the images that are available by using this command:
$ ls meta*/recipes*/images/*.bb
You will find that the recipe for core-image-minimal is in meta/recipes-core/images/core-image-minimal.bb.
A simple approach is to take an existing image recipe and modify it using statements similar to those you used in local.conf.
For example, imagine that you want an image that is the same as core-image-minimal but includes your helloworld program and the strace utility. You can do that with a two-line recipe file, which includes (using the require keyword) the base image and adds the packages you want. It is conventional to put the image in a directory named images, so add the recipe nova-image.bb with this content in meta-nova/recipes-local/images:
require recipes-core/images/core-image-minimal.bb
IMAGE_INSTALL += "helloworld strace"
Now, you can remove the IMAGE_INSTALL_append line from your local.conf and build it using this:
$ bitbake nova-image
Creating an SDK
It is very useful to be able to create a standalone toolchain that other developers can install, avoiding the need for everyone in the team to have a full installation of the Yocto Project. Ideally, you want the toolchain to include development libraries and header files for all the libraries installed on the target. You can do that for any image using the populate_sdk task, as shown here:
$ bitbake -c populate_sdk nova-image
The result is a self-installing shell script in tmp/deploy/sdk:
poky-<c_library>-<host_machine>-<target_image>-<target_machine>-toolchain-<version>.sh
For the SDK built with the nova-image recipe, it is this:
poky-glibc-x86_64-nova-image-cortexa8hf-neon-toolchain-2.2.1.sh
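As a quick check of how the fields of the installer name fit together, this Python sketch assembles the name from its parts. The function sdk_installer_name is hypothetical, and the hyphen between the image and machine fields is inferred from the nova-image example rather than taken from the Yocto Project documentation.

```python
def sdk_installer_name(c_library, host_machine, target_image,
                       target_machine, version):
    """Assemble an SDK installer filename from its fields (a sketch,
    based only on the naming pattern shown in the text)."""
    return ("poky-{}-{}-{}-{}-toolchain-{}.sh".format(
        c_library, host_machine, target_image, target_machine, version))

print(sdk_installer_name("glibc", "x86_64", "nova-image",
                         "cortexa8hf-neon", "2.2.1"))
```

This prints the same filename as the nova-image example, which makes it easy to predict the installer name for another image or machine.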
If you only want a basic toolchain with just C and C++ cross compilers, the C library, and header files, you can instead run this:
$ bitbake meta-toolchain
To install the SDK, just run the shell script. The default install directory is /opt/poky, but the install script allows you to change this:
$ tmp/deploy/sdk/poky-glibc-x86_64-nova-image-cortexa8hf-neon-toolchain-2.2.1.sh
Poky (Yocto Project Reference Distro) SDK installer version 2.2.1
=================================================================
Enter target directory for SDK (default: /opt/poky/2.2.1):
You are about to install the SDK to "/opt/poky/2.2.1". Proceed[Y/n]?
[sudo] password for chris:
Extracting SDK...........................done
Setting it up...done
To make use of the toolchain, first source the environment setup script:
$ source /opt/poky/2.2.1/environment-setup-cortexa8hf-neon-poky-linux-gnueabi
The environment-setup-* script that sets things up for the SDK is not compatible with the oe-init-build-env script that you source when working in the Yocto Project build directory. It is a good rule to always start a new terminal session before you source either script.
The toolchain generated by Yocto Project does not have a valid sysroot directory:
$ arm-poky-linux-gnueabi-gcc -print-sysroot
/not/exist
Consequently, if you try to cross compile, as I have shown in previous chapters, it will fail like this:
$ arm-poky-linux-gnueabi-gcc helloworld.c -o helloworld
helloworld.c:1:19: fatal error: stdio.h: No such file or directory
 #include <stdio.h>
                   ^
compilation terminated.
This is because the compiler has been configured to work for a wide range of ARM processors, and the fine tuning is done when you launch it using the right set of flags. Instead, you should use the shell variables that are created when you source the environment-setup script for cross compiling. They include these:
CC: The C compiler
CXX: The C++ compiler
CPP: The C preprocessor
AS: The assembler
LD: The linker
As an example, this is what CC has been set to:
$ echo $CC
arm-poky-linux-gnueabi-gcc -march=armv7-a -mfpu=neon -mfloat-abi=hard -mcpu=cortex-a8 --sysroot=/opt/poky/2.2.1/sysroots/cortexa8hf-neon-poky-linux-gnueabi
So long as you use $CC to compile, everything should work fine:
$ $CC helloworld.c -o helloworld
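One practical consequence is that CC is not a single program name but a command followed by flags, so a build script has to split it into words before running it. Here is a minimal Python sketch of that step; the helper compile_command is my own, and the CC value is copied from the output above rather than read from a real environment.

```python
import shlex

# Value of $CC as printed by the SDK environment shown above
cc = ("arm-poky-linux-gnueabi-gcc -march=armv7-a -mfpu=neon "
      "-mfloat-abi=hard -mcpu=cortex-a8 "
      "--sysroot=/opt/poky/2.2.1/sysroots/cortexa8hf-neon-poky-linux-gnueabi")

def compile_command(cc_value, source, output):
    """Build an argument list from a $CC string that carries embedded flags."""
    return shlex.split(cc_value) + [source, "-o", output]

argv = compile_command(cc, "helloworld.c", "helloworld")
print(argv[0])     # the compiler itself
print(argv[1:5])   # the tuning flags that travel with it
```

In a real script, you would take the value from os.environ["CC"] after sourcing the environment-setup script and pass the resulting list to subprocess.run.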
The license audit
The Yocto Project insists that each package has a license. A copy of the license is placed in tmp/deploy/licenses/[package name] for each package as it is built. In addition, a summary of the packages and licenses used in an image is put into the directory <image name>-<machine name>-<datestamp>/. For the nova-image we just built, the directory would be named something like this:
tmp/deploy/licenses/nova-image-beaglebone-20170417192546/
Further reading
You may want to look at the following documentation for more information:
The Buildroot User Manual, http://buildroot.org/downloads/manual/manual.html
Instant Buildroot, by Daniel Manchón Vizuete, Packt Publishing, 2013
Yocto Project documentation: There are nine reference guides plus a tenth, which is a composite of the others (the so-called Mega-manual), at https://www.yoctoproject.org/documentation
Embedded Linux Systems with the Yocto Project, by Rudolf J. Streif, Prentice Hall, 2016
Embedded Linux Projects Using Yocto Project Cookbook, by Alex Gonzalez, Packt Publishing, 2015
Summary
Using a build system takes the hard work out of creating an embedded Linux system, and it is almost always better than handcrafting a roll-your-own system. There is a range of open source build systems available these days: Buildroot and the Yocto Project represent two different approaches. Buildroot is simple and quick, making it a good choice for fairly simple single-purpose devices: traditional embedded Linux, as I like to think of them. The Yocto Project is more complex and flexible. It is package based, meaning that you have the option to install a package manager and perform updates of individual packages in the field. The meta layer structure makes it easy to extend the metadata, and indeed there is good support throughout the community and industry for the Yocto Project. The downside is that there is a very steep learning curve: you should expect it to take several months to become proficient with it, and even then it will sometimes do things that you don't expect, or at least that is my experience.
Don't forget that any devices you create using these tools will need to be maintained in the field for a period of time, often many years. Both Yocto Project and Buildroot provide point releases for about one year after the initial release. In either case, you will find yourself having to maintain your release yourself or else paying for commercial support. The third possibility, ignoring the problem, should not be considered an option!
In the next chapter, I will look at file storage and filesystems, and at the way that the choices you make there will affect the stability and maintainability of your embedded Linux.
Creating a Storage Strategy
The mass-storage options for embedded devices have a great impact on the rest of the system in terms of robustness, speed, and methods of in-field updates. Most devices employ flash memory in some form or other. Flash memory has become much less expensive over the past few years as storage capacities have increased from tens of megabytes to tens of gigabytes.
In this chapter, I will begin with a detailed look at the technology behind flash memory and how different memory organization affects the low-level driver software that has to manage it, including the Linux memory technology device layer, MTD.
For each flash technology, there are different choices of filesystem. I will describe those most commonly found on embedded devices and complete the survey with a section giving a summary of choices for each type of flash memory. The final sections consider techniques to make the best use of flash memory and draw everything together into a coherent storage strategy.
We will cover the following topics:
Storage options
Accessing flash memory from the bootloader
Accessing flash memory from Linux
Filesystems for flash memory
Filesystems for NOR and NAND flash memory
Filesystems for managed flash
Read-only compressed filesystems
Temporary filesystems
Making the root filesystem read-only
Filesystem choices
Storage options
Embedded devices need storage that takes little power and is physically compact, robust, and reliable over a lifetime of perhaps tens of years. In almost all cases, this means solid-state storage. Solid-state storage was introduced many years ago with read-only memory (ROM), but for the past 20 years, it has been flash memory of some kind. There have been several generations of flash memory in that time, progressing from NOR to NAND to managed flash such as eMMC.
NOR flash is expensive but reliable and can be mapped into the CPU address space, which allows you to execute code directly from flash. NOR flash chips are low capacity, ranging from a few megabytes to a gigabyte or so.
NAND flash memory is much cheaper than NOR and is available in higher capacities, in the range of tens of megabytes to tens of gigabytes. However, it needs a lot of hardware and software support to turn it into a useful storage medium.
Managed flash memory consists of one or more NAND flash chips packaged with a controller that handles the complexities of flash memory and presents a hardware interface similar to that of a hard disk. The attraction is that it removes complexity from the driver software and insulates the system designer from the frequent changes in flash technology. SD cards, eMMC chips, and USB flash drives fit into this category. Almost all of the current generation of smartphones and tablets have eMMC storage, and this trend is likely to spread to other categories of embedded devices.
Hard drives are seldom found in embedded systems. One exception is digital video recording in set-top boxes and smart TVs, in which a large amount of storage is needed with fast write times.
In all cases, robustness is of prime importance: you want the device to boot and reach a functional state despite power failures and unexpected resets. You should choose filesystems that behave well under such circumstances.
NOR flash
The memory cells in NOR flash chips are arranged into erase blocks of, for example, 128 KiB. Erasing a block sets all the bits to 1. It can be programmed one word at a time (8, 16, or 32 bits, depending on the data bus width). Each erase cycle damages the memory cells slightly, and after a number of cycles, the erase block becomes unreliable and cannot be used anymore. The maximum number of erase cycles should be given in the data sheet for the chip but is usually in the range of 100K to 1M.
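The erase and program behavior can be modeled in a few lines of Python. This is a toy sketch under stated assumptions: the class EraseBlock is invented for illustration, it programs one byte at a time rather than a bus-width word, and it relies on the general property of flash that programming can only change bits from 1 to 0, so only an erase can set them back to 1.

```python
class EraseBlock:
    """Toy model of one flash erase block (byte granularity for simplicity)."""

    def __init__(self, size):
        self.data = bytearray(b"\xff" * size)  # erased state: all bits 1
        self.erase_count = 0                   # wear accumulates per erase

    def erase(self):
        """Set every bit in the block back to 1."""
        self.data[:] = b"\xff" * len(self.data)
        self.erase_count += 1

    def program(self, offset, value):
        """Program one byte: bits can be cleared to 0 but never set to 1."""
        self.data[offset] &= value

block = EraseBlock(16)
block.program(0, 0x0F)
print(hex(block.data[0]))   # upper four bits cleared
block.program(0, 0xF0)
print(hex(block.data[0]))   # remaining bits cleared; cannot go back to 0xf0
block.erase()
print(hex(block.data[0]), block.erase_count)
```

The second program call shows why a location normally has to be erased before it is rewritten, and the erase counter is the quantity that the data sheet limit applies to.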
The data can be read word by word. The chip is usually mapped into the CPU address space, which means that you can execute code directly from NOR flash. This makes it a convenient place to put the bootloader code as it needs no initialization beyond hardwiring the address mapping. SoCs that support NOR flash in this way have configurations that provide a default memory mapping such that it encompasses the reset vector of the CPU.
The kernel, and even the root filesystem, can also be located in flash memory, avoiding the need for copying them into RAM, and thus creating devices with small memory footprints. The technique is known as eXecute In Place, or XIP. It is very specialized and I will not examine it further here. I have included some references at the end of the chapter.
There is a standard register-level interface for NOR flash chips called the Common Flash Interface or CFI, which all modern chips support. The CFI is described in standard JESD68, which you can get from https://www.jedec.org/.
NAND flash
NAND flash is much cheaper than NOR flash and has a higher capacity. First-generation NAND chips stored one bit per memory cell in what is now known as an SLC or single-level cell organization. Later generations moved on to two bits per cell in multi-level cell (MLC) chips and now to three bits per cell in triple-level cell (TLC) chips. As the number of bits per cell has increased, the reliability of the storage has decreased, requiring more complex controller hardware and software to compensate. Where reliability is a concern, you should make sure you are using SLC NAND flash chips.
As with NOR flash, NAND flash is organized into erase blocks ranging in size from 16 KiB to 512 KiB and, once again, erasing a block sets all the bits to 1. However, the number of erase cycles before the block becomes unreliable is lower, typically as few as 1K cycles for TLC chips and up to 100K for SLC. NAND flash can only be read and written in pages, usually of 2 or 4 KiB. Since the pages cannot be accessed byte-by-byte, they cannot be mapped into the address space and so code and data have to be copied into RAM before they can be accessed.
Data transfers to and from the chip are prone to bit flips, which can be detected and corrected using error-correction codes (ECCs). SLC chips generally use a simple Hamming code, which can be implemented efficiently in software and can correct a single-bit error in a page read. MLC and TLC chips need more sophisticated codes, such as Bose-Chaudhuri-Hocquenghem (BCH), which can correct up to 8-bit errors per page. These need hardware support.
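To give a feel for how a Hamming code corrects a single flipped bit, here is the textbook Hamming(7,4) code in Python. It is a sketch of the principle only: it protects 4 data bits with 3 parity bits, whereas the ECC used for NAND pages works over far larger chunks of data and is laid out differently. The function names are my own.

```python
def hamming74_encode(d):
    """Encode 4 data bits as a 7-bit codeword (positions 1..7;
    parity bits sit at positions 1, 2, and 4)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(codeword):
    """Fix at most one flipped bit. Returns (data_bits, error_position),
    where error_position is 0 if no error was detected."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # syndrome = position of the bad bit
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position

data = [1, 0, 1, 1]
stored = hamming74_encode(data)
stored[5] ^= 1                        # simulate a bit flip at position 6
recovered, where = hamming74_correct(stored)
print(recovered, where)
```

The syndrome computed from the three parity checks is the position of the flipped bit, which is why a single error can be repaired but two errors in the same codeword cannot.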
The ECCs have to be stored somewhere, and so there is an extra area of memory per page known as the out-of-band (OOB) area, or the spare area. SLC designs usually have 1 byte of OOB per 32 bytes of main storage, so for a 2 KiB page device, the OOB is 64 bytes per page, and for a 4 KiB page, it is 128 bytes. MLC and TLC chips have proportionally larger OOB areas to accommodate more complex ECCs. The following diagram shows the organization of a chip with a 128 KiB erase block and 2 KiB pages:
During production, the manufacturer tests all the blocks and marks any that fail by setting a flag in the OOB area of each page in the block. It is not uncommon to find that brand new chips have up to 2% of their blocks marked bad in this way. Furthermore, it is within the specification for a similar proportion of blocks to give errors on erase before the erase cycle limit is reached. The NAND flash driver should detect this and mark the block as bad.
After space has been taken in the OOB area for a bad block flag and ECC bytes, there are still some bytes left. Some flash filesystems make use of these free bytes to store filesystem metadata. Consequently, many parts of the system are interested in the layout of the OOB area: the SoC ROM boot code, the bootloader, the kernel MTD driver, the filesystem code, and the tools to create filesystem images. There is not much standardization, so it is easy to get into a situation in which the bootloader writes data using an OOB format that cannot be read by the kernel MTD driver. It is up to you to make sure that they all agree.
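The page, OOB, and bad block figures quoted above are easy to check with a little arithmetic. The Python sketch below uses the SLC ratio of 1 OOB byte per 32 bytes of main storage; the function names and the 128 MiB chip size are assumptions chosen for illustration.

```python
KIB = 1024
MIB = 1024 * KIB

def nand_geometry(erase_block, page_size, oob_ratio=32):
    """Return (pages per erase block, OOB bytes per page) for an SLC-style
    layout with 1 OOB byte per oob_ratio bytes of main storage."""
    return erase_block // page_size, page_size // oob_ratio

def bad_block_allowance(chip_size, erase_block, percent=2):
    """Number of blocks that may be marked bad, rounded down."""
    total_blocks = chip_size // erase_block
    return total_blocks * percent // 100

print(nand_geometry(128 * KIB, 2 * KIB))   # 2 KiB pages
print(nand_geometry(128 * KIB, 4 * KIB))   # 4 KiB pages
# Hypothetical 128 MiB chip with 128 KiB erase blocks
print(bad_block_allowance(128 * MIB, 128 * KIB))
```

For the 128 KiB erase block with 2 KiB pages, this gives 64 pages per block and 64 bytes of OOB per page, matching the figures in the text; the hypothetical 128 MiB chip has 1,024 blocks, so up to 20 of them could be bad when new.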
Access to NAND flash chips requires a NAND flash controller, which is usually part of the SoC. You will need the corresponding driver in the bootloader and kernel. The NAND flash controller handles the hardware interface to the chip, transferring data to and from pages, and may include hardware for error correction.
There is a standard register-level interface for NAND flash chips known as the Open NAND Flash Interface or ONFi, which most modern chips adhere to. See http://www.onfi.org/.
Managed flash
The burden of supporting flash memory in the operating system, NAND in particular, becomes less if there is a well-defined hardware interface and a standard flash controller that hides the complexities of the memory. This is managed flash memory, and it is becoming more and more common. In essence, it means combining one or more flash chips with a microcontroller that offers an ideal storage device with a small sector size that is compatible with conventional filesystems. The most important types of managed flash for embedded systems are Secure Digital (SD) cards and the embedded variant known as eMMC.
MultiMediaCard and Secure Digital cards
The MultiMediaCard (MMC) was introduced in 1997 by SanDisk and Siemens as a form of packaged storage using flash memory. Shortly after, in 1999, SanDisk, Matsushita, and Toshiba created the Secure Digital (SD) card, which is based on MMC but adds encryption and DRM (the secure in the name). Both were intended for consumer electronics such as digital cameras, music players, and similar devices. Currently, SD cards are the dominant form of managed flash for consumer and embedded electronics, even though the encryption features are seldom used. Newer versions of the SD specification allow smaller packaging (mini SD and micro SD, which is often written as uSD) and larger capacities: high capacity SDHC up to 32 GB and extended capacity SDXC up to 2 TB.
The hardware interface for MMC and SD cards is very similar, and it is possible to use full-sized MMC cards in full-sized SD card slots (but not the other way round). Early incarnations used a 1-bit Serial Peripheral Interface (SPI); more recent cards use a 4-bit interface.
Thereisacommandsetforreadingandwritingmemoryinsectorsof512bytes.InsidethepackageisamicrocontrollerandoneormoreNANDflashchips,asshowninthefollowingdiagram:
Themicrocontrollerimplementsthecommandsetandmanagestheflashmemory,performingthefunctionofaflashtranslationlayer,asdescribedlateroninthischapter.TheyarepreformattedwithaFATfilesystem:FAT16onSDSCcards,FAT32onSDHC,andexFATonSDXC.ThequalityoftheNANDflashchips
andthesoftwareonthemicrocontrollervariesgreatlybetweencards.Itisquestionablewhetheranyofthemaresufficientlyreliablefordeepembeddeduse,andcertainlynotwithaFATfilesystem,whichispronetofilecorruption.RememberthattheprimeusecaseforMMCandSDcardsisforremovablestorageoncameras,tablets,andphones.
eMMC
eMMC or Embedded MMC is simply MMC memory packaged so that it can be soldered on to the motherboard, using a 4- or 8-bit interface for data transfer. However, eMMC chips are intended to be used as storage for an operating system, so the components are capable of performing that task. The chips are usually not preformatted with any filesystem.
Other types of managed flash
One of the first managed flash technologies was CompactFlash (CF), which uses a subset of the Personal Computer Memory Card International Association (PCMCIA) hardware interface. CF exposes the memory through a parallel ATA interface and appears to the operating system as a standard hard disk. CF cards are common in x86-based single board computers and professional video and camera equipment.
One other format that we use every day is the USB flash drive. In this case, the memory is accessed through a USB interface and the controller implements the USB mass storage specification as well as the flash translation layer and interface to the flash chip, or chips. The USB mass storage protocol, in turn, is based on the SCSI disk command set. As with MMC and SD cards, USB flash drives are usually preformatted with a FAT filesystem. Their main use case in embedded systems is to exchange data with PCs.
A recent addition to the list of options for managed flash storage is Universal Flash Storage (UFS). Like eMMC, it is packaged in a chip that is mounted on the motherboard. It has a high-speed serial interface and can achieve data rates greater than eMMC. It supports a SCSI disk command set.
Accessing flash memory from the bootloader
In Chapter 3, All About Bootloaders, I mentioned the need for the bootloader to load kernel binaries and other images from various flash devices, and to perform system maintenance tasks such as erasing and reprogramming flash memory. It follows that the bootloader must have the drivers and infrastructure needed to support read, erase, and write operations on the type of memory you have, whether it be NOR, NAND, or managed. I will use U-Boot in the following examples; other bootloaders follow a similar pattern.
U-Boot and NOR flash
U-Boot has drivers for NOR CFI chips in drivers/mtd and has the commands erase to erase memory and cp.b to copy data byte by byte, programming the flash cells. Suppose that you have NOR flash memory mapped from 0x40000000 to 0x48000000, of which 4 MiB starting at 0x40040000 is a kernel image. The last byte of that region is at 0x4043ffff, so you would load a new kernel into flash using these U-Boot commands:
U-Boot # tftpboot 100000 uImage
U-Boot # erase 40040000 4043ffff
U-Boot # cp.b 100000 40040000 ${filesize}
The variable filesize in the preceding example is set by the tftpboot command to the size of the file just downloaded.
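The end address given to erase is the last byte of the region, not the first byte after it, which is easy to get wrong by hand. The following is a minimal sketch for checking the arithmetic on a development host, assuming Python 3; erase_range is a hypothetical helper and not part of U-Boot:

```python
MiB = 1024 * 1024


def erase_range(start, size):
    """Return the (first, last) byte addresses of a region of `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return start, start + size - 1


# The 4 MiB kernel region from the example above.
first, last = erase_range(0x40040000, 4 * MiB)
print(f"erase {first:x} {last:x}")  # prints: erase 40040000 4043ffff
```

On real hardware the start and end addresses must also fall on erase block boundaries, which this sketch does not check.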
U-Boot and NAND flash
For NAND flash, you need a driver for the NAND flash controller on your SoC, which you can find in the U-Boot source code in the directory drivers/mtd/nand. You use the nand command to manage the memory using the sub-commands erase, write, and read. This example shows a kernel image being loaded into RAM at 0x82000000 and then placed into flash starting at offset 0x280000:
U-Boot # tftpboot 82000000 uImage
U-Boot # nand erase 280000 400000
U-Boot # nand write 82000000 280000 ${filesize}
U-Boot can also read files stored in the JFFS2, YAFFS2, and UBIFS filesystems.
U-Boot and MMC, SD, and eMMC
U-Boot has drivers for several MMC controllers in drivers/mmc. You can access the raw data using mmc read and mmc write at the user interface level, which allows you to handle raw kernel and filesystem images.
U-Boot can also read files from the FAT32 and ext4 filesystems on MMC storage.
Accessing flash memory from Linux
Raw NOR and NAND flash memory is handled by the Memory Technology Device subsystem, or MTD, which provides basic interfaces to read, erase, and write blocks of flash memory. In the case of NAND flash, there are also functions to handle the OOB area and to identify bad blocks.
For managed flash, you need drivers to handle the particular hardware interface. MMC/SD cards and eMMC use the mmcblk driver; CompactFlash and hard drives use the SCSI disk driver, sd. USB flash drives use the usb_storage driver together with the sd driver.
Memory technology devices
The MTD subsystem was started by David Woodhouse in 1999 and has been extensively developed over the intervening years. In this section, I will concentrate on the way it handles the two main technologies, NOR and NAND flash.
MTD consists of three layers: a core set of functions, a set of drivers for various types of chips, and user-level drivers that present the flash memory as a character device or a block device, as shown in the following diagram:
The chip drivers are at the lowest level and interface with flash chips. Only a small number of drivers are needed for NOR flash chips, enough to cover the CFI standard and variations plus a few non-compliant chips, which are now mostly obsolete. For NAND flash, you will need a driver for the NAND flash controller you are using; this is usually supplied as part of the board support package. There are drivers for about 40 of them in the current mainline kernel in the directory drivers/mtd/nand.
MTD partitions
In most cases, you will want to partition the flash memory into a number of areas, for example, to provide space for a bootloader, a kernel image, or a root filesystem. In MTD, there are several ways to specify the size and location of partitions, the main ones being:
Through the kernel command line using CONFIG_MTD_CMDLINE_PARTS
Via the device tree using CONFIG_MTD_OF_PARTS
With a platform-mapping driver
In the case of the first option, the kernel command-line option to use is mtdparts, which is defined as follows in the Linux source code in drivers/mtd/cmdlinepart.c:
mtdparts=<mtddef>[;<mtddef>]
<mtddef> := <mtd-id>:<partdef>[,<partdef>]
<mtd-id> := unique name for the chip
<partdef> := <size>[@<offset>][<name>][ro][lk]
<size> := size of partition OR "-" to denote all remaining space
<offset> := offset to the start of the partition; leave blank to follow the previous partition without any gap
<name> := '(' NAME ')'
Perhaps an example will help. Imagine that you have one flash chip of 128 MiB that is to be divided into five partitions. A typical command line would be this:
mtdparts=:512k(SPL)ro,780k(U-Boot)ro,128k(U-BootEnv),4m(Kernel),-(Filesystem)
The first element, before the colon, is mtd-id, which identifies the flash chip, either by number or by the name assigned by the board support package. If there is only one chip, as here, it can be left empty. If there is more than one chip, the information for each is separated by a semicolon. Then, for each chip, there is a comma-separated list of partitions, each with a size in bytes, KiB (k), or MiB (m) and a name in parentheses. The ro suffix makes the partition read-only to MTD and is often used to prevent accidental overwriting of the bootloader. The size of the last partition for the chip may be replaced by a dash (-), indicating that it should take up all the remaining space.
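To see how the sizes on the command line turn into partition offsets, here is a rough host-side sketch in Python 3. It handles only a single chip and only the subset of the syntax shown above; parse_mtdparts and its return format are hypothetical and are not how the kernel implements the parser:

```python
import re

UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

# <size>[@<offset>][(<name>)][ro][lk]
PARTDEF = re.compile(
    r"^(?P<size>-|0x[0-9a-fA-F]+|\d+)(?P<unit>[kKmMgG]?)"
    r"(?:@(?P<offset>0x[0-9a-fA-F]+|\d+)(?P<ounit>[kKmMgG]?))?"
    r"(?:\((?P<name>[^)]*)\))?(?P<ro>ro)?(?P<lk>lk)?$"
)


def parse_mtdparts(value, chip_size):
    """Parse '<mtd-id>:<partdef>,...' for one chip.

    Returns a list of (name, offset, size, read_only) tuples.
    """
    _mtd_id, _, partdefs = value.partition(":")
    parts, cursor = [], 0
    for partdef in partdefs.split(","):
        m = PARTDEF.match(partdef.strip())
        if m is None:
            raise ValueError(f"bad partdef: {partdef!r}")
        if m["offset"] is None:
            offset = cursor  # follow the previous partition without a gap
        else:
            offset = int(m["offset"], 0) * UNITS[m["ounit"].lower()]
        if m["size"] == "-":
            size = chip_size - offset  # all remaining space
        else:
            size = int(m["size"], 0) * UNITS[m["unit"].lower()]
        parts.append((m["name"], offset, size, m["ro"] is not None))
        cursor = offset + size
    return parts


cmdline = ":512k(SPL)ro,780k(U-Boot)ro,128k(U-BootEnv),4m(Kernel),-(Filesystem)"
for name, offset, size, ro in parse_mtdparts(cmdline, 128 * 1024 * 1024):
    print(f"{name:12s} offset=0x{offset:07x} size=0x{size:07x} ro={ro}")
```

For the 128 MiB chip in the example, the last partition starts at 0x563000 and is 0x7a9d000 bytes long.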
You can see a summary of the configuration at runtime by reading /proc/mtd:
# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00080000 00020000 "SPL"
mtd1: 000C3000 00020000 "U-Boot"
mtd2: 00020000 00020000 "U-BootEnv"
mtd3: 00400000 00020000 "Kernel"
mtd4: 07A9D000 00020000 "Filesystem"
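The size and erasesize columns are hexadecimal byte counts. As an illustration, the following Python 3 sketch turns that listing into numbers you can check; parse_proc_mtd is a hypothetical helper, and the sample text is copied from the listing above rather than read from a live system:

```python
import re

# One row of /proc/mtd: device, size (hex), erase size (hex), quoted name.
ROW = re.compile(r'^(mtd\d+):\s*([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')

SAMPLE = '''dev:    size   erasesize  name
mtd0: 00080000 00020000 "SPL"
mtd1: 000C3000 00020000 "U-Boot"
mtd2: 00020000 00020000 "U-BootEnv"
mtd3: 00400000 00020000 "Kernel"
mtd4: 07A9D000 00020000 "Filesystem"
'''


def parse_proc_mtd(text):
    """Return a list of dicts, one per MTD partition; the header line is skipped."""
    parts = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            dev, size, erasesize, name = m.groups()
            parts.append({"dev": dev, "size": int(size, 16),
                          "erasesize": int(erasesize, 16), "name": name})
    return parts


for p in parse_proc_mtd(SAMPLE):
    print(f"{p['dev']} {p['name']:12s} {p['size'] // 1024} KiB")
```

On a target you would pass the contents of /proc/mtd instead of SAMPLE. The five sizes add up to the full 128 MiB of the chip.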
There is more detailed information for each partition in /sys/class/mtd, including the erase block size and the page size, and it is nicely summarized using mtdinfo:
# mtdinfo /dev/mtd0
mtd0
Name: SPL
Type: nand
Eraseblock size: 131072 bytes, 128.0 KiB
Amount of eraseblocks: 4 (524288 bytes, 512.0 KiB)
Minimum input/output unit size: 2048 bytes
Sub-page size: 512 bytes
OOB size: 64 bytes
Character device major/minor: 90:0
Bad blocks are allowed: true
Device is writable: false
Another way of specifying MTD partitions is through the device tree. Here is an example that creates the same partitions as the command-line example:
nand@0,0 {
    #address-cells = <1>;
    #size-cells = <1>;
    partition@0 {
        label = "SPL";
        reg = <0 0x80000>;
    };
    partition@80000 {
        label = "U-Boot";
        reg = <0x80000 0xc3000>;
    };
    partition@143000 {
        label = "U-BootEnv";
        reg = <0x143000 0x20000>;
    };
    partition@163000 {
        label = "Kernel";
        reg = <0x163000 0x400000>;
    };
    partition@563000 {
        label = "Filesystem";
        reg = <0x563000 0x7a9d000>;
    };
};
A third alternative is to code the partition information as platform data in an mtd_partition structure, as shown in this example taken from arch/arm/mach-omap2/board-omap3beagle.c (NAND_BLOCK_SIZE is defined elsewhere to be 128 KiB):
static struct mtd_partition omap3beagle_nand_partitions[] = {
    {
        .name = "X-Loader",
        .offset = 0,
        .size = 4 * NAND_BLOCK_SIZE,
        .mask_flags = MTD_WRITEABLE, /* force read-only */
    },
    {
        .name = "U-Boot",
        .offset = 0x80000,
        .size = 15 * NAND_BLOCK_SIZE,
        .mask_flags = MTD_WRITEABLE, /* force read-only */
    },
    {
        .name = "U-BootEnv",
        .offset = 0x260000,
        .size = 1 * NAND_BLOCK_SIZE,
    },
    {
        .name = "Kernel",
        .offset = 0x280000,
        .size = 32 * NAND_BLOCK_SIZE,
    },
    {
        .name = "FileSystem",
        .offset = 0x680000,
        .size = MTDPART_SIZ_FULL, /* all remaining space */
    },
};
Platform data is deprecated: you will only find it used in BSPs for old SoCs that have not been updated to use a device tree.
MTD device drivers
The upper level of the MTD subsystem is a pair of device drivers:
A character device, with a major number of 90. There are two device nodes per MTD partition number, N: /dev/mtdN (minor number = N*2) and /dev/mtdNro (minor number = N*2 + 1). The latter is just a read-only version of the former.
A block device, with a major number of 31 and a minor number of N. The device nodes are in the form /dev/mtdblockN.
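The numbering rule can be written down in a few lines. This is only an illustrative Python 3 sketch of the scheme just described; mtd_device_nodes is a hypothetical name:

```python
def mtd_device_nodes(n):
    """Map each device node of MTD partition n to (type, major, minor)."""
    return {
        f"/dev/mtd{n}": ("c", 90, 2 * n),        # read-write character device
        f"/dev/mtd{n}ro": ("c", 90, 2 * n + 1),  # read-only character device
        f"/dev/mtdblock{n}": ("b", 31, n),       # block device
    }


for node, (kind, major, minor) in mtd_device_nodes(6).items():
    print(f"{node}: {kind} {major}:{minor}")
```

For partition 6, the character devices have minor numbers 12 and 13, and the block device has minor number 6.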
The MTD character device, mtd
The character devices are the most important: they allow you to access the underlying flash memory as an array of bytes so that you can read and write (program) the flash. They also implement a number of ioctl functions that allow you to erase blocks and to manage the OOB area on NAND chips. The following list is taken from include/uapi/mtd/mtd-abi.h:
IOCTL               Description
MEMGETINFO          Gets basic MTD characteristic information
MEMERASE            Erases blocks in the MTD partition
MEMWRITEOOB         Writes out-of-band data for the page
MEMREADOOB          Reads out-of-band data for the page
MEMLOCK             Locks the chip (if supported)
MEMUNLOCK           Unlocks the chip (if supported)
MEMGETREGIONCOUNT   Gets the number of erase regions: non-zero if there are erase blocks of differing sizes in the partition, which is common for NOR flash, rare on NAND
MEMGETREGIONINFO    If MEMGETREGIONCOUNT is non-zero, this can be used to get the offset, size, and block count of each region
MEMGETOOBSEL        Deprecated
MEMGETBADBLOCK      Gets the bad block flag
MEMSETBADBLOCK      Sets the bad block flag
OTPSELECT           Sets OTP (one-time programmable) mode, if the chip supports it
OTPGETREGIONCOUNT   Gets the number of OTP regions
OTPGETREGIONINFO    Gets information about an OTP region
ECCGETLAYOUT        Deprecated
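As an illustration of how one of these calls might be used, here is a hedged Python 3 sketch of MEMGETINFO. It rests on two assumptions that you should check against your own kernel headers: that struct mtd_info_user is laid out as a u8 type followed by five u32 fields (flags, size, erasesize, writesize, oobsize) and a u64 of padding, and that the architecture uses the generic Linux ioctl number encoding, which is not true of every architecture. The function names are hypothetical:

```python
import fcntl
import struct

# Assumed layout of struct mtd_info_user (see include/uapi/mtd/mtd-abi.h):
# __u8 type; 3 bytes of alignment; 5 x __u32; __u64 padding.
MTD_INFO_FMT = "=B3x5IQ"


def _IOR(type_char, nr, size):
    """Build a read ioctl number using the generic Linux encoding (assumption)."""
    return (2 << 30) | (size << 16) | (ord(type_char) << 8) | nr


MEMGETINFO = _IOR("M", 1, struct.calcsize(MTD_INFO_FMT))


def decode_mtd_info(raw):
    """Unpack the bytes returned by MEMGETINFO into a dict."""
    mtd_type, flags, size, erasesize, writesize, oobsize, _pad = struct.unpack(
        MTD_INFO_FMT, raw)
    return {"type": mtd_type, "flags": flags, "size": size,
            "erasesize": erasesize, "writesize": writesize, "oobsize": oobsize}


def get_mtd_info(path):
    """Query an MTD character device such as /dev/mtd0 (needs real hardware)."""
    with open(path, "rb") as dev:
        buf = bytearray(struct.calcsize(MTD_INFO_FMT))
        fcntl.ioctl(dev, MEMGETINFO, buf)  # the kernel fills buf in place
        return decode_mtd_info(bytes(buf))
```

On a target with MTD support, get_mtd_info("/dev/mtd0") should report the same erase block, page, and OOB sizes that mtdinfo prints. In production code, the mtd-utils programs or a C program that includes mtd/mtd-user.h are the safer route, because they take the structure and ioctl numbers directly from the kernel headers.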
There is a set of utility programs known as mtd-utils for manipulating flash memory that makes use of these ioctl functions. The source is available from git://git.infradead.org/mtd-utils.git, and is available as a package in the Yocto Project and Buildroot. The essential tools are shown in the following list. The package also contains utilities for the JFFS2 and UBI/UBIFS filesystems, which I will cover later. For each of these tools, the MTD character device is one of the parameters:
flash_erase: Erases a range of blocks.
flash_lock: Locks a range of blocks.
flash_unlock: Unlocks a range of blocks.
nanddump: Dumps memory from NAND flash, optionally including the OOB area. Skips bad blocks.
nandtest: Tests and diagnostics for NAND flash.
nandwrite: Writes (programs) data from a file into NAND flash, skipping bad blocks.
You must always erase flash memory before writing new contents to it: flash_erase is the command to do that.
To program NOR flash, you simply copy bytes to the MTD device node using a file copy command such as cp.
Unfortunately, this doesn't work with NAND memory as the copy will fail at the first bad block. Instead, use nandwrite, which skips over any bad blocks. To read back NAND memory, you should use nanddump, which also skips bad blocks.
The MTD block device, mtdblock
The mtdblock driver is little used. Its purpose is to present flash memory as a block device you can use to format and mount as a filesystem. However, it has severe limitations because it does not handle bad blocks in NAND flash, it does not do wear leveling, and it does not handle the mismatch in size between filesystem blocks and flash erase blocks. In other words, it does not have a flash translation layer, which is essential for reliable file storage. The only case where the mtdblock device is useful is to mount read-only filesystems such as Squashfs on top of reliable flash memory such as NOR.
If you want a read-only filesystem on NAND flash, you should use the UBI driver, as described later in this chapter.
Logging kernel oops to MTD
A kernel error, or oops, is normally logged via the klogd and syslogd daemons to a circular memory buffer or a file. Following a reboot, the log will be lost in the case of a ring buffer, and even in the case of a file, it may not have been properly written before the system crashed. A more reliable method is to write oops and kernel panics to an MTD partition as a circular log buffer. You enable it with CONFIG_MTD_OOPS and add console=ttyMTDN to the kernel command line, N being the MTD device number to write the messages to.
Simulating NAND memory
The NAND simulator emulates a NAND chip using system RAM. The main use is for testing code that has to be NAND-aware without access to physical NAND memory. In particular, the ability to simulate bad blocks, bit flips, and other errors allows you to test code paths that are difficult to exercise using real flash memory. For more information, the best place to look is in the code itself, which has a comprehensive description of the ways you can configure the driver. The code is in drivers/mtd/nand/nandsim.c. Enable it with the kernel configuration CONFIG_MTD_NAND_NANDSIM.
The MMC block driver
MMC/SD cards and eMMC chips are accessed using the mmcblk block driver. You need a host controller driver to match the MMC adapter you are using, which is part of the board support package. The drivers are located in the Linux source code in drivers/mmc/host.
MMC storage is partitioned using a partition table in exactly the same way you would for hard disks, using fdisk or a similar utility.
Filesystems for flash memory
There are several challenges when making efficient use of flash memory for mass storage: the mismatch between the size of an erase block and a disk sector, the limited number of erase cycles per erase block, and the need for bad block handling on NAND chips. These differences are resolved by a flash translation layer, or FTL.
Flash translation layers
A flash translation layer has the following features:
Sub allocation: Filesystems work best with a small allocation unit, traditionally a 512-byte sector. This is much smaller than a flash erase block of 128 KiB or more. Therefore, erase blocks have to be subdivided into smaller units to avoid wasting large amounts of space.
Garbage collection: A consequence of sub allocation is that an erase block will contain a mixture of good data and stale data after the filesystem has been in use for a while. Since we can only free up whole erase blocks, the only way to reclaim the free space is to coalesce the good data into one place and return the now empty erase block to the free list: this is garbage collection, and is usually implemented as a background thread.
Wear leveling: There is a limit on the number of erase cycles for each block. To maximize the lifespan of a chip, it is important to move data around so that each block is erased roughly the same number of times.
Bad block handling: On NAND flash chips, you have to avoid using any block marked bad and also mark good blocks as bad if they cannot be erased.
Robustness: Embedded devices may be powered off or reset without warning, so any filesystem should be able to cope without corruption, usually by incorporating a journal or log of transactions.
There are several ways to deploy the flash translation layer:
In the filesystem: as with JFFS2, YAFFS2, and UBIFS
In the block device driver: the UBI driver, on which UBIFS depends, implements some aspects of a flash translation layer
In the device controller: as with managed flash devices
When the flash translation layer is in the filesystem or the block driver, the code is part of the kernel and so it is open source, meaning that we can see how it works and we can expect that it will be improved over time. On the other hand, if the FTL is inside a managed flash device, it is hidden from view and we cannot verify whether or not it works as we would want. Not only that, but putting the FTL into the disk controller means that it misses out on information that is held at the filesystem layer, such as which sectors belong to files that have been deleted and so do not contain useful data anymore. The latter problem is solved by adding commands that pass this information between the filesystem and the device. I will describe how this works in the section on the TRIM command later on. However, the question of code visibility remains. If you are using managed flash, you just have to choose a manufacturer you can trust.
Filesystems for NOR and NAND flash memory
To use raw flash chips for mass storage, you have to use a filesystem that understands the peculiarities of the underlying technology. There are three such filesystems:
JFFS2 (Journaling Flash File System 2): This was the first flash filesystem for Linux, and is still in use today. It works for NOR and NAND memory, but is notoriously slow during mount.
YAFFS2 (Yet Another Flash File System 2): This is similar to JFFS2, but specifically for NAND flash memory. It was adopted by Google as the preferred raw flash filesystem on Android devices.
UBIFS (Unsorted Block Image File System): This works in conjunction with the UBI block driver to create a reliable flash filesystem. It works well with both NOR and NAND memory, and since it generally offers better performance than JFFS2 or YAFFS2, it should be the preferred solution for new designs.
All of these use MTD as the common interface to flash memory.
JFFS2
The Journaling Flash File System had its beginnings in the software for the Axis 2100 network camera in 1999. For many years, it was the only flash filesystem for Linux and has been deployed on many thousands of different types of devices. Today, it is not the best choice, but I will cover it first because it shows the beginning of the evolutionary path.
JFFS2 is a log-structured filesystem that uses MTD to access flash memory. In a log-structured filesystem, changes are written sequentially as nodes to the flash memory. A node may contain changes to a directory, such as the names of files created and deleted, or it may contain changes to file data. After a while, a node may be superseded by information contained in subsequent nodes and becomes an obsolete node.
Erase blocks are categorized into three types:
Free: This contains no nodes at all
Clean: This contains only valid nodes
Dirty: This contains at least one obsolete node
At any one time, there is one block receiving updates, which is called the open block. If power is lost or the system is reset, the only data that can be lost is the last write to the open block. In addition, nodes are compressed as they are written, increasing the effective storage capacity of the flash chip, which is important if you are using expensive NOR flash memory.
When the number of free blocks falls below a threshold, a garbage-collector kernel thread is started, which scans for dirty blocks, copies the valid nodes into the open block, and then frees up the dirty block.
At the same time, the garbage collector provides a crude form of wear leveling because it cycles valid data from one block to another. The way that the open block is chosen means that each block is erased roughly the same number of times so long as it contains data that changes from time to time. Sometimes a clean block is chosen for garbage collection to make sure that blocks containing static data that is seldom written are also wear-leveled.
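The block categories and the garbage collection step can be pictured with a toy model. The following Python 3 sketch is purely illustrative: it represents an erase block as a list of node states, ignores block capacity, compression, and wear counts, and makes no claim to match the algorithm in the JFFS2 source. The function names are hypothetical:

```python
def classify(block):
    """Classify an erase block given as a list of 'valid'/'obsolete' nodes."""
    if not block:
        return "free"
    return "dirty" if "obsolete" in block else "clean"


def garbage_collect(blocks, open_index):
    """Reclaim one dirty block; return its index, or None if nothing is dirty.

    Valid nodes are copied to the open block, then the victim is erased.
    """
    dirty = [i for i, b in enumerate(blocks)
             if i != open_index and classify(b) == "dirty"]
    if not dirty:
        return None
    # Toy policy: pick the block with the most obsolete nodes.
    victim = max(dirty, key=lambda i: blocks[i].count("obsolete"))
    blocks[open_index].extend(n for n in blocks[victim] if n == "valid")
    blocks[victim] = []  # erased, so it is now a free block
    return victim


blocks = [["valid", "obsolete", "obsolete"], ["valid", "valid"], [], ["valid"]]
print([classify(b) for b in blocks])
print("reclaimed block", garbage_collect(blocks, open_index=3))
print([classify(b) for b in blocks])
```

After the collection step, block 0 has moved from dirty to free and its one valid node lives on in the open block.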
JFFS2 filesystems have a write-through cache, meaning that writes are written to the flash memory synchronously as if they have been mounted with the -o sync option. While improving reliability, it does increase the time to write data. There is a further problem with small writes: if the length of a write is comparable to the size of the node header (40 bytes) the overhead becomes high. A well-known corner case is log files, produced, for example, by syslogd.
Summary nodes
There is one overriding disadvantage to JFFS2: since there is no on-chip index, the directory structure has to be deduced at mount-time by reading the log from start to finish. At the end of the scan, you have a complete picture of the directory structure of the valid nodes, but the time taken is proportional to the size of the partition. It is not uncommon to see mount times of the order of one second per megabyte, leading to total mount times of tens or hundreds of seconds.
To reduce the time to scan during mount, summary nodes became an option in Linux 2.6.15. A summary node is written at the end of the open erase block just before it is closed. The summary node contains all of the information needed for the mount-time scan, thereby reducing the amount of data to process during the scan. Summary nodes can reduce mount times by a factor of between two and five, at the expense of an overhead of about 5% of the storage space. They are enabled with the kernel configuration CONFIG_JFFS2_SUMMARY.
Clean markers
An erased block with all bits set to 1 is indistinguishable from a block that has been written with 1's, but the latter has not had its memory cells refreshed and cannot be programmed again until it is erased. JFFS2 uses a mechanism called clean markers to distinguish between these two situations. After a successful block erase, a clean marker is written, either to the beginning of the block or to the OOB area of the first page of the block. If the clean marker exists, then it must be a clean block.
Creating a JFFS2 filesystem
Creating an empty JFFS2 filesystem at runtime is as simple as erasing an MTD partition with clean markers and then mounting it. There is no formatting step because a blank JFFS2 filesystem consists entirely of free blocks. For example, to format MTD partition 6, you would enter these commands on the device:
# flash_erase -j /dev/mtd6 0 0
# mount -t jffs2 mtd6 /mnt
The -j option to flash_erase adds the clean markers, and mounting with type jffs2 presents the partition as an empty filesystem. Note that the device to be mounted is given as mtd6, not /dev/mtd6. Alternatively, you can give the block device node /dev/mtdblock6. This is just a peculiarity of JFFS2. Once mounted, you can treat it like any other filesystem.
You can create a filesystem image directly from the staging area of your development system using mkfs.jffs2 to write out the files in JFFS2 format, and sumtool to add the summary nodes. Both of these are part of the mtd-utils package.
As an example, to create an image of the files in rootfs for a NAND flash device with an erase block size of 128 KiB (0x20000) and with summary nodes, you would use these two commands:
$ mkfs.jffs2 -n -e 0x20000 -p -d ~/rootfs -o ~/rootfs.jffs2
$ sumtool -n -e 0x20000 -p -i ~/rootfs.jffs2 -o ~/rootfs-sum.jffs2
The -p option adds padding at the end of the image file to make it a whole number of erase blocks. The -n option suppresses the creation of clean markers in the image, which is normal for NAND devices, as the clean marker is in the OOB area. For NOR devices, you would leave out the -n option. You can use a device table with mkfs.jffs2 to set the permissions and the ownership of files by adding -D [device table]. Of course, Buildroot and the Yocto Project will do all this for you.
You can program the image into flash memory from your bootloader. For example, if you have loaded a filesystem image into RAM at address 0x82000000 and you want to load it into a flash partition that begins at 0x163000 bytes from the start of the flash chip and is 0x7a9d000 bytes long, the U-Boot commands would be:
nand erase clean 163000 7a9d000
nand write 82000000 163000 7a9d000
You can do the same thing from Linux using the mtd driver like this:
# flash_erase -j /dev/mtd6 0 0
# nandwrite /dev/mtd6 rootfs-sum.jffs2
To boot with a JFFS2 root filesystem, you need to pass the mtdblock device on the kernel command line for the partition and a rootfstype because JFFS2 cannot be auto-detected:
root=/dev/mtdblock6 rootfstype=jffs2
YAFFS2
The YAFFS filesystem was written by Charles Manning, beginning in 2001, specifically to handle NAND flash chips at a time when JFFS2 did not. Subsequent changes to handle larger (2 KiB) page sizes resulted in YAFFS2. The website for YAFFS is http://www.yaffs.net.
YAFFS is also a log-structured filesystem following the same design principles as JFFS2. The different design decisions mean that it has a faster mount-time scan, simpler and faster garbage collection, and has no compression, which speeds up reads and writes at the expense of less efficient use of storage.
YAFFS is not limited to Linux; it has been ported to a wide range of operating systems. It has a dual license: GPLv2 to be compatible with Linux, and a commercial license for other operating systems. Unfortunately, the YAFFS code has never been merged into mainline Linux, so you will have to patch your kernel.
To get YAFFS2 and patch a kernel, you would use this:
$ git clone git://www.aleph1.co.uk/yaffs2
$ cd yaffs2
$ ./patch-ker.sh c m <path to your Linux source>
Then, configure the kernel with CONFIG_YAFFS_YAFFS2.
Creating a YAFFS2 filesystem
As with JFFS2, to create a YAFFS2 filesystem at runtime, you only need to erase the partition and mount it, but note that in this case, you do not enable clean markers:
# flash_erase /dev/mtd6 0 0
# mount -t yaffs2 /dev/mtdblock6 /mnt
To create a filesystem image, the simplest thing to do is use the mkyaffs2 tool from https://code.google.com/p/yaffs2utils using the following command:
$ mkyaffs2 -c 2048 -s 64 rootfs rootfs.yaffs2
Here, -c is the page size and -s the OOB size. There is a tool named mkyaffs2image that is part of the YAFFS code, but it has a couple of drawbacks. Firstly, the page and OOB size are hard-coded in the source: you will have to edit and recompile if you have memory that does not match the defaults of 2,048 and 64. Secondly, the OOB layout is incompatible with MTD, which uses the first two bytes as a bad block marker, whereas mkyaffs2image uses those bytes to store part of the YAFFS metadata.
To copy the image to the MTD partition from a Linux shell prompt on the target, follow these steps:
# flash_erase /dev/mtd6 0 0
# nandwrite -a /dev/mtd6 rootfs.yaffs2
To boot with a YAFFS2 root filesystem, add the following to the kernel command line:
root=/dev/mtdblock6 rootfstype=yaffs2
UBI and UBIFS
The Unsorted Block Image (UBI) driver is a volume manager for flash memory that takes care of bad block handling and wear leveling. It was implemented by Artem Bityutskiy and first appeared in Linux 2.6.22. In parallel with that, engineers at Nokia were working on a filesystem that would take advantage of the features of UBI, which they called UBIFS; it appeared in Linux 2.6.27. Splitting the flash translation layer in this way makes the code more modular and also allows other filesystems to take advantage of the UBI driver, as we shall see later on.
UBI
UBI provides an idealized, reliable view of a flash chip by mapping physical erase blocks (PEB) to logical erase blocks (LEB). Bad blocks are not mapped to LEBs and so are never used. If a block cannot be erased, it is marked as bad and dropped from the mapping. UBI keeps a count of the number of times each PEB has been erased in the header of the PEB and changes the mapping to ensure that each PEB is erased roughly the same number of times.
UBI accesses the flash memory through the MTD layer. As an extra feature, it can divide an MTD partition into a number of UBI volumes, which improves wear leveling in the following way: Imagine that you have two filesystems, one containing fairly static data, for example a root filesystem, and the other containing data that is constantly changing.
If they are stored in separate MTD partitions, the wear leveling only has an effect on the second one, whereas if you choose to store them in two UBI volumes in a single MTD partition, the wear leveling takes place over both areas of the storage, and the lifetime of the flash memory is increased. The following diagram illustrates this situation:
In this way, UBI fulfills two of the requirements of a flash translation layer: wear leveling and bad-block handling.
To prepare an MTD partition for UBI, you don't use flash_erase as with JFFS2 and YAFFS2. Instead, you use the ubiformat utility, which preserves the erase counts that are stored in the PEB headers. ubiformat needs to know the minimum unit of I/O, which for most NAND flash chips is the page size, but some chips allow reading and writing in sub pages that are a half or a quarter of the page size. Consult the chip data sheet for details and, if in doubt, use the page size. This example prepares mtd6 using a page size of 2048 bytes:
# ubiformat /dev/mtd6 -s 2048
ubiformat: mtd0 (nand), size 134217728 bytes (128.0 MiB),
1024 eraseblocks of 131072 bytes (128.0 KiB),
min. I/O size 2048 bytes
Then you can use the ubiattach command to load the UBI driver on an MTD partition that has been prepared in this way:
# ubiattach -p /dev/mtd6 -O 2048
UBI device number 0, total 1024 LEBs (130023424 bytes, 124.0 MiB),
available 998 LEBs (126722048 bytes, 120.9 MiB),
LEB size 126976 bytes (124.0 KiB)
This creates the device node /dev/ubi0 through which you can access the UBI volumes. You can use ubiattach on several MTD partitions, in which case they can be accessed through /dev/ubi1, /dev/ubi2, and so on. Note that since each LEB has a header containing the meta information used by UBI, the LEB is smaller than the PEB by two pages. For example, a chip with a PEB size of 128 KiB and 2 KiB pages would have an LEB of 124 KiB. This is important information that you will need when creating a UBIFS image.
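The LEB arithmetic is worth checking for your own chip geometry. Here is a minimal Python 3 sketch, assuming the two-page header overhead described above; leb_size and lebs_needed are hypothetical helper names:

```python
KiB = 1024
MiB = 1024 * KiB


def leb_size(peb_size, page_size):
    """LEB size when the UBI headers occupy the first two pages of each PEB."""
    return peb_size - 2 * page_size


def lebs_needed(volume_bytes, leb_bytes):
    """Number of whole LEBs needed to hold volume_bytes (rounded up)."""
    return -(-volume_bytes // leb_bytes)


leb = leb_size(128 * KiB, 2 * KiB)
print(leb, "bytes per LEB")
print(lebs_needed(32 * MiB, leb), "LEBs for a 32 MiB volume")
```

With 128 KiB PEBs and 2 KiB pages, the LEB size is 126976 bytes (124 KiB), and a 32 MiB volume needs 265 LEBs because the size is rounded up to a whole number of LEBs.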
The PEB-to-LEB mapping is loaded into memory during the attach phase, a process that takes time proportional to the number of PEBs, typically a few seconds. A new feature was added in Linux 3.7 called the UBI fastmap, which checkpoints the mapping to flash from time to time and so reduces the attach time. The kernel configuration option is CONFIG_MTD_UBI_FASTMAP.
The first time you attach to an MTD partition after a ubiformat, there will be no volumes. You can create volumes using ubimkvol. For example, suppose you have a 128 MiB MTD partition and you want to split it into two volumes; the first is to be 32 MiB in size and the second will take up the remaining space:
# ubimkvol /dev/ubi0 -N vol_1 -s 32MiB
Volume ID 0, size 265 LEBs (33648640 bytes, 32.1 MiB),
LEB size 126976 bytes (124.0 KiB), dynamic, name "vol_1", alignment 1
# ubimkvol /dev/ubi0 -N vol_2 -m
Volume ID 1, size 733 LEBs (93073408 bytes, 88.8 MiB),
LEB size 126976 bytes (124.0 KiB), dynamic, name "vol_2", alignment 1
Now, you have a device with the nodes /dev/ubi0_0 and /dev/ubi0_1. You can confirm the situation using ubinfo:
# ubinfo -a /dev/ubi0
ubi0
Volumes count: 2
Logical eraseblock size: 126976 bytes, 124.0 KiB
Total amount of logical eraseblocks: 1024 (130023424 bytes, 124.0 MiB)
Amount of available logical eraseblocks: 0 (0 bytes)
Maximum count of volumes 128
Count of bad physical eraseblocks: 0
Count of reserved physical eraseblocks: 20
Current maximum erase counter value: 1
Minimum input/output unit size: 2048 bytes
Character device major/minor: 250:0
Present volumes: 0, 1
Volume ID: 0 (on ubi0)
Type: dynamic
Alignment: 1
Size: 265 LEBs (33648640 bytes, 32.1 MiB)
State: OK
Name: vol_1
Character device major/minor: 250:1
-----------------------------------
Volume ID: 1 (on ubi0)
Type: dynamic
Alignment: 1
Size: 733 LEBs (93073408 bytes, 88.8 MiB)
State: OK
Name: vol_2
Character device major/minor: 250:2
At this point, you have a 128 MiB MTD partition containing two UBI volumes of sizes 32 MiB and 88.8 MiB. The total storage available is 32 MiB plus 88.8 MiB, which equals 120.8 MiB. The remaining space, 7.2 MiB, is taken up by the UBI headers at the start of each PEB and space reserved for mapping out blocks that go bad during the lifetime of the chip.
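You can reproduce this accounting from the numbers that ubinfo reports. The following is a small Python 3 sketch using the LEB counts and LEB size from the output above; ubi_space_report is a hypothetical helper:

```python
MiB = 1024 * 1024


def ubi_space_report(mtd_bytes, leb_bytes, volume_lebs):
    """Return the bytes usable in volumes and the bytes lost to UBI overhead."""
    usable = sum(volume_lebs) * leb_bytes
    return {"usable": usable, "overhead": mtd_bytes - usable}


# 128 MiB partition, 126976-byte LEBs, volumes of 265 and 733 LEBs.
report = ubi_space_report(128 * MiB, 126976, [265, 733])
print(f"usable   {report['usable'] / MiB:.2f} MiB")
print(f"overhead {report['overhead'] / MiB:.2f} MiB")
```

Working from exact byte counts gives about 120.85 MiB usable and 7.15 MiB of overhead; the figures in the text differ slightly because they are derived from sizes already rounded to one decimal place.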
UBIFS
UBIFS uses a UBI volume to create a robust filesystem. It adds sub-allocation and garbage collection to create a complete flash translation layer. Unlike JFFS2 and YAFFS2, it stores index information on-chip, and so mounting is fast, although don't forget that attaching the UBI volume beforehand may take a significant amount of time. It also allows write-back caching as in a normal disk filesystem, which means that writes are much faster, but with the usual problem of potential loss of data that has not been flushed from the cache to flash memory in the event of power down. You can resolve the problem by making careful use of the fsync(2) and fdatasync(2) functions to force a flush of file data at crucial points.
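As a sketch of what forcing a flush at a crucial point can look like, here is a small Python 3 example that uses os.fsync, which wraps fsync(2). The write_durably helper is hypothetical, and syncing the containing directory as well is a common precaution on Linux rather than something UBIFS specifically requires:

```python
import os
import tempfile


def write_durably(path, data):
    """Write data to path and force it to storage before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's buffered data into the kernel
        os.fsync(f.fileno())  # ask the kernel to flush the file to the device
    # Also flush the directory so that the new directory entry is persistent.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


# Demonstration in a temporary directory on the development host.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "settings.bin")
write_durably(demo_file, b"calibration=42\n")
print(os.path.getsize(demo_file), "bytes written and synced")
```

Each fsync forces a write to flash, so call it only at points where losing the data would matter, not after every small write.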
UBIFS has a journal for fast recovery in the event of power down. The minimum size of the journal is 4 MiB, so UBIFS is not suitable for very small flash devices.
Once you have created the UBI volumes, you can mount them using the device node for the volume, such as /dev/ubi0_0, or by using the device node for the whole partition plus the volume name, as shown here:
# mount -t ubifs ubi0:vol_1 /mnt
Creating a filesystem image for UBIFS is a two-stage process: first you create a UBIFS image using mkfs.ubifs, and then embed it into a UBI volume using ubinize.
For the first stage, mkfs.ubifs needs to be informed of the page size with -m, the size of the UBI LEB with -e, and the maximum number of erase blocks in the volume with -c. If the first volume is 32 MiB and an erase block is 128 KiB, then the number of erase blocks is 256. So, to take the contents of the directory rootfs and create a UBIFS image named rootfs.ubi, you would type the following:
$ mkfs.ubifs -r rootfs -m 2048 -e 124KiB -c 256 -o rootfs.ubi
The second stage requires you to create a configuration file for ubinize, which describes the characteristics of each volume in the image. The help page (ubinize -h) gives details of the format. This example creates two volumes, vol_1 and vol_2:
[ubifsi_vol_1]
mode=ubi
image=rootfs.ubi
vol_id=0
vol_name=vol_1
vol_size=32MiB
vol_type=dynamic
[ubifsi_vol_2]
mode=ubi
image=data.ubi
vol_id=1
vol_name=vol_2
vol_type=dynamic
vol_flags=autoresize
The second volume has an auto-resize flag and so will expand to fill the remaining space on the MTD partition. Only one volume can have this flag. From this information, ubinize will create an image file named by the -o parameter, with the PEB size -p, the page size -m, and the sub-page size -s:
$ ubinize -o ~/ubi.img -p 128KiB -m 2048 -s 512 ubinize.cfg
To install this image on the target, you would enter these commands on the target:
# ubiformat /dev/mtd6 -s 2048
# nandwrite /dev/mtd6 /ubi.img
# ubiattach -p /dev/mtd6 -O 2048
If you want to boot with a UBIFS root filesystem, you would provide these kernel command-line parameters:
ubi.mtd=6 root=ubi0:vol_1 rootfstype=ubifs
Filesystems for managed flash
As the trend towards managed flash technologies continues, particularly eMMC, we need to consider how to use it effectively. While managed flash devices appear to have the same characteristics as hard disk drives, the underlying NAND flash chips have the limitations of large erase blocks with limited erase cycles and bad block handling. And, of course, we need robustness in the event of losing power.
It is possible to use any of the normal disk filesystems, but we should try to choose one that reduces disk writes and has a fast restart after an unscheduled shutdown.
Flashbench
To make optimum use of the underlying flash memory, you need to know the erase block size and page size. Manufacturers do not publish these numbers as a rule, but it is possible to deduce them by observing the behavior of the chip or card.
Flashbench is one such tool. It was initially written by Arnd Bergmann, as described in the LWN article available at http://lwn.net/Articles/428584. You can get the code from https://github.com/bradfa/flashbench.
Here is a typical run on a SanDisk 4 GB SDHC card:
$sudo./flashbench-a/dev/mmcblk0--blocksize=1024
align536870912pre4.38mson4.48mspost3.92msdiff332µs
align268435456pre4.86mson4.9mspost4.48msdiff227µs
align134217728pre4.57mson5.99mspost5.12msdiff1.15ms
align67108864pre4.95mson5.03mspost4.54msdiff292µs
align33554432pre5.46mson5.48mspost4.58msdiff462µs
align16777216pre3.16mson3.28mspost2.52msdiff446µs
align8388608pre3.89mson4.1mspost3.07msdiff622µs
align4194304pre4.01mson4.89mspost3.9msdiff940µs
align2097152pre3.55mson4.42mspost3.46msdiff917µs
align1048576pre4.19mson5.02mspost4.09msdiff876µs
align524288pre3.83mson4.55mspost3.65msdiff805µs
align262144pre3.95mson4.25mspost3.57msdiff485µs
align131072pre4.2mson4.25mspost3.58msdiff362µs
align65536pre3.89mson4.24mspost3.57msdiff511µs
align32768pre3.94mson4.28mspost3.6msdiff502µs
align16384pre4.82mson4.86mspost4.17msdiff372µs
align8192pre4.81mson4.83mspost4.16msdiff349µs
align4096pre4.16mson4.21mspost4.16msdiff52.4µs
align2048pre4.16mson4.16mspost4.17msdiff9ns
Flashbench reads blocks of, in this case, 1,024 bytes just before and just after various power-of-two boundaries. As you cross a page or erase block boundary, the reads after the boundary take longer. The rightmost column shows the difference and is the one that is most interesting. Reading from the bottom, there is a big jump at 4 KiB, which is the most likely size of a page. There is a second jump from 52.4 µs to 349 µs at 8 KiB. This is fairly common and indicates that the card can use multi-plane accesses to read two 4 KiB pages at the same time. Beyond that, the differences are less well marked, but there is a clear jump from 485 µs to 805 µs at 512 KiB, which is probably the erase block size. Given that the card being tested is quite old, these are the sort of numbers you would expect.
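Picking out the jumps by eye becomes tedious when you have several cards to compare. As an illustration only, this Python sketch parses the lower part of the output above and reports the alignments at which the diff value grows by at least a chosen factor over the next smaller alignment. The factor of 1.5 and the function names are assumptions made for this example; they are not part of flashbench.

```python
import re

# The ten smallest alignments from the run shown in the text.
SAMPLE = """\
align 1048576 pre 4.19ms on 5.02ms post 4.09ms diff 876µs
align 524288 pre 3.83ms on 4.55ms post 3.65ms diff 805µs
align 262144 pre 3.95ms on 4.25ms post 3.57ms diff 485µs
align 131072 pre 4.2ms on 4.25ms post 3.58ms diff 362µs
align 65536 pre 3.89ms on 4.24ms post 3.57ms diff 511µs
align 32768 pre 3.94ms on 4.28ms post 3.6ms diff 502µs
align 16384 pre 4.82ms on 4.86ms post 4.17ms diff 372µs
align 8192 pre 4.81ms on 4.83ms post 4.16ms diff 349µs
align 4096 pre 4.16ms on 4.21ms post 4.16ms diff 52.4µs
align 2048 pre 4.16ms on 4.16ms post 4.17ms diff 9ns
"""

LINE = re.compile(r"align\s+(\d+)\s.*\sdiff\s+([\d.]+)\s*(\S+)")


def to_microseconds(value, unit):
    """Convert a flashbench time to microseconds; anything not ns or ms is taken as µs."""
    if unit == "ns":
        return value / 1000.0
    if unit == "ms":
        return value * 1000.0
    return value


def parse(text):
    """Return (alignment, diff in µs) pairs sorted by increasing alignment."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            diff = to_microseconds(float(match.group(2)), match.group(3))
            rows.append((int(match.group(1)), diff))
    return sorted(rows)


def find_jumps(rows, factor=1.5):
    """List alignments whose diff is at least `factor` times the previous one."""
    jumps = []
    for (_, previous), (align, diff) in zip(rows, rows[1:]):
        if diff >= previous * factor:
            jumps.append(align)
    return jumps


print(find_jumps(parse(SAMPLE)))
```

With this sample, the sketch reports 4096, 8192, and 524288, which matches the reading given above. The measurements are noisy, so the threshold is only a starting point and the result still needs a sanity check against the raw numbers.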
Discard and TRIM
Usually, when you delete a file, only the modified directory node is written to storage, while the sectors containing the file's contents remain unchanged. When the flash translation layer is in the disk controller, as with managed flash, it does not know that this group of disk sectors no longer contains useful data and so it ends up copying stale data.
In the last few years, the addition of transactions that pass information about deleted sectors down to the disk controller has improved the situation. The SCSI and SATA specifications have a TRIM command and MMC has a similar command named ERASE. In Linux, this feature is known as discard.
To make use of discard, you need a storage device that supports it (most current eMMC chips do) and a Linux device driver to match. You can check by looking at the block system queue parameters in /sys/block/<block device>/queue/. The ones of interest are as follows:
discard_granularity: The size of the internal allocation unit of the device
discard_max_bytes: The maximum number of bytes that can be discarded in one go
discard_zeroes_data: If 1, discarded data will be set to 0
If the device or the device driver does not support discard, these values are all set to 0. As an example, these are the parameters you will see from the 2 GiB eMMC chip on my BeagleBone Black:
# grep -s "" /sys/block/mmcblk0/queue/discard_*
/sys/block/mmcblk0/queue/discard_granularity:2097152
/sys/block/mmcblk0/queue/discard_max_bytes:2199023255040
/sys/block/mmcblk0/queue/discard_zeroes_data:1
There is more information in the kernel documentation file, Documentation/block/queue-sysfs.txt.
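If you want to make the same check from a script, for example before deciding whether to mount with discard enabled, something along these lines would do. This is a sketch rather than a definitive test: the function names are hypothetical, and it assumes that a non-zero discard_max_bytes is a sufficient sign of support, following the statement above that unsupported devices report 0.

```python
from pathlib import Path


def read_discard_params(queue_dir):
    """Read every discard_* attribute in a block queue directory into a dict of ints."""
    params = {}
    for path in sorted(Path(queue_dir).glob("discard_*")):
        try:
            params[path.name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Skip attributes that cannot be read or are not plain integers.
            continue
    return params


def supports_discard(params):
    """Assumption: a device supports discard if discard_max_bytes is non-zero."""
    return params.get("discard_max_bytes", 0) > 0


# Example for the eMMC device used in the text; prints an empty dict and
# False on a machine that has no such device.
params = read_discard_params("/sys/block/mmcblk0/queue")
print(params, supports_discard(params))
```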
You can enable discard when mounting a filesystem by adding the option -o discard to the mount command. Both ext4 and F2FS support it.
Make sure that the storage device supports discard before using the -o discard mount option, or data loss can occur.
It is also possible to force discard from the command line independently of how the partition is mounted using the fstrim command, which is part of the util-linux package. Typically, you would run this command periodically to free up unused space. fstrim operates on a mounted filesystem, so to trim the root filesystem, /, you would type the following:
# fstrim -v /
/: 2061000704 bytes were trimmed
The preceding example uses the verbose option, -v, so that it prints out the number of bytes potentially freed up. In this case, 2,061,000,704 is the approximate amount of free space in the filesystem, so it is the maximum amount of storage that could have been trimmed.
Ext4
The extended filesystem, ext, has been the main filesystem for Linux desktops since 1992. The current version, ext4, is very stable and well tested and has a journal that makes recovery from an unscheduled shutdown fast and mostly painless. It is a good choice for managed flash devices and you will find that it is the preferred filesystem for Android devices that have eMMC storage. If the device supports discard, you can mount with the option -o discard.
To format and create an ext4 filesystem at runtime, and then mount it, you would type the following:
# mkfs.ext4 /dev/mmcblk0p2
# mount -t ext4 -o discard /dev/mmcblk0p2 /mnt
To create a filesystem image at build time, you can use the genext2fs utility, available from http://genext2fs.sourceforge.net. In this example, I have specified the block size with -B and the number of blocks in the image with -b:
$ genext2fs -B 1024 -b 10000 -d rootfs rootfs.ext4
The genext2fs utility can make use of a device table to set the file permissions and ownership, as described in Chapter 5, Building a Root Filesystem, with -D [file table].
As the name implies, this will actually generate an image in Ext2 format. You can upgrade to Ext4 using tune2fs as follows (details of the command options can be found in the manual page tune2fs(8)):
$ tune2fs -j -J size=1 -O filetype,extents,uninit_bg,dir_index \
rootfs.ext4
$ e2fsck -pDf rootfs.ext4
Both the Yocto Project and Buildroot use exactly these steps when creating images in Ext4 format.
While a journal is an asset for devices that may power down without warning, it does add extra write cycles to each write transaction, wearing out the flash memory. If the device is battery powered, especially if the battery is not removable, the chances of an unscheduled power down are small and so you may want to leave the journal out.
F2FS
The Flash-Friendly File System, known as F2FS, is a log-structured filesystem designed for managed flash devices, especially eMMC chips and SD cards. It was written by Samsung and was merged into mainline Linux in 3.8. It is marked experimental, indicating that it has not been extensively deployed as yet, but it seems that some Android devices are using it.
F2FS takes into account the page and erase block sizes and tries to align data on these boundaries. The log format provides resilience in the face of power down and also provides good write performance, in some tests showing a twofold improvement over ext4. There is a good description of the design of F2FS in the kernel documentation in Documentation/filesystems/f2fs.txt, and there are references at the end of the chapter.
The mkfs.f2fs utility creates an empty F2FS filesystem, with the label given by -l:
# mkfs.f2fs -l rootfs /dev/mmcblk0p1
# mount -t f2fs /dev/mmcblk0p1 /mnt
There isn't (yet) a tool to create F2FS filesystem images offline.
FAT16/32
The old Microsoft filesystems, FAT16 and FAT32, continue to be important as a common format understood by most operating systems. When you buy an SD card or USB flash drive, it is almost certain to be formatted as FAT32 and, in some cases, the on-card microcontroller is optimized for FAT32 access patterns. Also, some boot ROMs require a FAT partition for the second-stage bootloader, the TI OMAP-based chips for example. However, FAT formats are definitely not suitable for storing critical files because they are prone to corruption and make poor use of the storage space.
Linux supports FAT16 through the msdos filesystem and both FAT32 and FAT16 through the vfat filesystem. To mount a device, say an SD card, on the second mmc hardware adapter, you would type this:
# mount -t vfat /dev/mmcblk1p1 /mnt
In the past, there have been licensing issues with the vfat driver, which may (or may not) infringe a patent held by Microsoft.
FAT32 has a limitation of 32 GiB on the device size. Devices of a larger capacity may be formatted using the Microsoft exFAT format, and it is a requirement for SDXC cards. At the time of writing, there is no kernel driver for exFAT, but it can be supported by means of a user space FUSE driver. Since exFAT is proprietary to Microsoft, there are bound to be licensing implications if you support this format on your device.
Read-only compressed filesystems
Compressing data is useful if you don't have quite enough storage to fit everything in. Both JFFS2 and UBIFS do on-the-fly data compression by default. However, if the files are never going to be written, as is usually the case with the root filesystem, you can achieve better compression ratios by using a read-only compressed filesystem. Linux supports several of these: romfs, cramfs, and squashfs. The first two are obsolete now, so I will describe only squashfs.
squashfs
The squashfs filesystem was written by Phillip Lougher in 2002 as a replacement for cramfs. It existed as a kernel patch for a long time, eventually being merged into mainline Linux in version 2.6.29 in 2009. It is very easy to use: you create a filesystem image using mksquashfs and install it to the flash memory:
$ mksquashfs rootfs rootfs.squashfs
The resulting filesystem is read-only, so there is no mechanism to modify any of the files at runtime. The only way to update a squashfs filesystem is to erase the whole partition and program in a new image.
squashfs is not bad-block aware and so must be used with reliable flash memory such as NOR flash. However, it can be used on NAND flash as long as you use UBI to create an emulated, reliable MTD. You have to enable the kernel configuration CONFIG_MTD_UBI_BLOCK, which will create a read-only MTD block device for each UBI volume. The following diagram shows two MTD partitions, each with accompanying mtdblock devices. The second partition is also used to create a UBI volume that is exposed as a third, reliable mtdblock device, which you can use for any read-only filesystem that is not bad-block aware:
Temporary filesystems
There are always some files that have a short lifetime or have no significance after a reboot. Many such files are put into /tmp, and so it makes sense to keep these files from reaching permanent storage.
The temporary filesystem, tmpfs, is ideal for this purpose. You can create a temporary RAM-based filesystem by simply mounting tmpfs:
# mount -t tmpfs tmp_files /tmp
As with procfs and sysfs, there is no device node associated with tmpfs, so you have to supply a place-keeper string, tmp_files in the preceding example.
The amount of memory used will grow and shrink as files are created and deleted. The default maximum size is half the physical RAM. In most cases, it would be a disaster if tmpfs grew to be that large, so it is a very good idea to cap it with a -o size parameter. The parameter can be given in bytes, KiB (k), MiB (m), or GiB (g), like this, for example:
# mount -t tmpfs -o size=1m tmp_files /tmp
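To make the size suffixes concrete, here is a small Python sketch that converts a size string into bytes and assembles the corresponding mount command as an argument list. The function names are made up for this example, and the parser handles only the plain-byte, k, m, and g forms described above.

```python
MULTIPLIERS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def tmpfs_size_bytes(size):
    """Convert a size such as '4096', '512k', '1m' or '2g' into bytes."""
    size = size.strip().lower()
    if size and size[-1] in MULTIPLIERS:
        return int(size[:-1]) * MULTIPLIERS[size[-1]]
    return int(size)


def tmpfs_mount_command(mountpoint, size, name="tmp_files"):
    """Build the argument list for mounting a size-capped tmpfs.

    `name` is the place-keeper string; tmpfs has no device node.
    """
    return ["mount", "-t", "tmpfs", "-o", f"size={size}", name, mountpoint]


print(tmpfs_size_bytes("1m"))
print(" ".join(tmpfs_mount_command("/tmp", "1m")))
```

The second line of output is the same command as the one shown above. Building it as a list makes it easy to hand to subprocess.run() from a boot-time script, should you want to.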
In addition to /tmp, some subdirectories of /var contain volatile data, and it is good practice to use tmpfs for them as well, either by creating a separate filesystem for each or, more economically, using symbolic links. Buildroot does it this way:
/var/cache -> /tmp
/var/lock -> /tmp
/var/log -> /tmp
/var/run -> /tmp
/var/spool -> /tmp
/var/tmp -> /tmp
In the Yocto Project, /run and /var/volatile are tmpfs mounts with symbolic links pointing to them, as shown here:
/tmp -> /var/tmp
/var/lock -> /run/lock
/var/log -> /var/volatile/log
/var/run -> /run
/var/tmp -> /var/volatile/tmp
Making the root filesystem read-only
You need to make your target device able to survive unexpected events, including file corruption, and still be able to boot and achieve at least a minimum level of function. Making the root filesystem read-only is a key part of achieving this ambition because it eliminates accidental overwrites. Making it read-only is easy: replace rw with ro on the kernel command line or use an inherently read-only filesystem such as squashfs. However, you will find that there are a few files and directories that are traditionally writable:
/etc/resolv.conf: This file is written by network configuration scripts to record the addresses of DNS name servers. The information is volatile, so you simply have to make it a symlink to a temporary directory, for example, /etc/resolv.conf -> /var/run/resolv.conf.
/etc/passwd: This file, along with /etc/group, /etc/shadow, and /etc/gshadow, stores user and group names and passwords. They need to be symbolically linked to an area of persistent storage.
/var/lib: Many applications expect to be able to write to this directory and to keep permanent data here as well. One solution is to copy a base set of files to a tmpfs filesystem at boot time and then bind mount /var/lib to the new location by putting a sequence of commands such as these into one of the boot scripts:
$ mkdir -p /var/volatile/lib
$ cp -a /var/lib/* /var/volatile/lib
$ mount --bind /var/volatile/lib /var/lib
/var/log: This is the place where syslog and other daemons keep their logs. Generally, logging to flash memory is not desirable because of the many small write cycles it generates. A simple solution is to mount /var/log using tmpfs, making all log messages volatile. In the case of syslogd, BusyBox has a version that can log to a circular ring buffer.
If you are using the Yocto Project, you can create a read-only root filesystem by adding IMAGE_FEATURES = "read-only-rootfs" to conf/local.conf or to your image recipe.
Filesystem choices
So far we have looked at the technology behind solid-state memory and at the many types of filesystems. Now it is time to summarize the options. In most cases, you will be able to divide your storage requirements into these three categories:
Permanent, read-write data: Runtime configuration, network parameters, passwords, data logs, and user data
Permanent, read-only data: Programs, libraries, and configuration files that are constant, for example, the root filesystem
Volatile data: Temporary storage, for example, /tmp
The choices for read-write storage are as follows:
NOR: UBIFS or JFFS2
NAND: UBIFS, JFFS2, or YAFFS2
eMMC: ext4 or F2FS
For read-only storage, you can use any of these, mounted with the ro attribute. Additionally, if you want to save space, you could use squashfs. Finally, for volatile storage, there is only one choice, tmpfs.
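The summary above amounts to a small lookup table. Purely as an illustration, it could be written in Python like this; the names are invented for the example, and the table only records the options listed in this section (remember that squashfs on NAND also needs the UBI block layer described earlier).

```python
# Read-write choices per storage technology, as listed in the text.
READ_WRITE = {
    "NOR": ["UBIFS", "JFFS2"],
    "NAND": ["UBIFS", "JFFS2", "YAFFS2"],
    "eMMC": ["ext4", "F2FS"],
}


def filesystem_choices(storage, usage):
    """Return candidate filesystems for a storage type and a usage category.

    usage is one of 'read-write', 'read-only' or 'volatile'.
    """
    if usage == "volatile":
        return ["tmpfs"]
    choices = list(READ_WRITE[storage])
    if usage == "read-only":
        # Any read-write filesystem mounted ro, plus squashfs to save space.
        choices.append("squashfs")
    elif usage != "read-write":
        raise ValueError(f"unknown usage: {usage}")
    return choices


print(filesystem_choices("eMMC", "read-write"))
print(filesystem_choices("NAND", "read-only"))
print(filesystem_choices("NOR", "volatile"))
```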
Further reading
The following resources have further information about the topics introduced in this chapter:
XIP: The past, the present... the future?, Vitaly Wool, presentation at FOSDEM 2007: https://archive.fosdem.org/2007/slides/devrooms/embedded/Vitaly_Wool_XIP.pdf
General MTD documentation: http://www.linux-mtd.infradead.org/doc/general.html
Optimizing Linux with cheap flash drives, Arnd Bergmann: http://lwn.net/Articles/428584/
Flash memory card design: https://wiki.linaro.org/WorkingGroups/KernelArchived/Projects/FlashCardSurvey
eMMC/SSD File System Tuning Methodology: http://elinux.org/images/b/b6/EMMC-SSD_File_System_Tuning_Methodology_v1.0.pdf
Flash-Friendly File System (F2FS): http://elinux.org/images/1/12/Elc2013_Hwang.pdf
An f2fs teardown: http://lwn.net/Articles/518988/
Summary
Flash memory has been the storage technology of choice for embedded Linux from the beginning, and over the years, Linux has gained very good support, from low-level drivers up to flash-aware filesystems, the latest being UBIFS.
As the rate at which new flash technologies are introduced increases, it is becoming harder to keep pace with the changes at the high end. System designers are increasingly turning to managed flash in the form of eMMC to provide a stable hardware and software interface that is independent of the memory chips inside. Embedded Linux developers are beginning to get to grips with these new chips. Support for TRIM in ext4 and F2FS is well established, and it is slowly finding its way into the chips themselves. Also, the appearance of new filesystems that are optimized to manage flash, such as F2FS, is a welcome step forward.
However, the fact remains that flash memory is not the same as a hard disk drive. You have to be careful to minimize the number of filesystem writes, especially as the higher density TLC chips may be able to support as few as 1,000 erase cycles.
In the next chapter, I will continue on the theme of storage options as I consider different ways to keep the software up to date on devices that may be deployed to remote locations.
Updating Software in the Field
In previous chapters, we discussed various ways to build the software for a Linux device and also how to create system images for various types of mass storage. When you go into production, you just need to copy the system image to the flash memory and it is ready to be deployed. Now, I want to consider the life of the device beyond the first shipment.
As we move into the era of the Internet of Things, the devices that we create are very likely to be connected together by the internet. At the same time, the software is becoming exponentially more complex. More software means more bugs. Connection to the internet means those bugs can be exploited from afar. Consequently, we have a common requirement to be able to update the software in the field. Software update brings more advantages than fixing bugs, however. It opens the door to adding value to existing hardware by improving system performance over time or enabling features.
There are many approaches to software update. Broadly, I characterize them as:
Local update, often performed by a technician who carries the update on a portable medium such as a USB flash drive or an SD card, and has to access each system individually
Remote update, where the update is initiated by the user or a technician locally, but it is downloaded from a remote server
Over-the-air (OTA) update, where the update is pushed and managed entirely remotely, without any need for local input
I will begin by describing several approaches to software update, and then I will show an example using Mender (https://mender.io).
In this chapter, we will cover these topics:
What to update?
The basics of software update
Types of update mechanism
OTA updates
Using Mender for local updates
Using Mender for OTA updates
What to update?
Embedded Linux devices are very diverse in their design and implementation. However, they all have these basic components:
Bootloader
Kernel
Root filesystem
System applications
Device-specific data
Some components are harder to update than others, as summarized in this diagram:
Let's look at each component in turn.
Bootloader
The bootloader is the first piece of code to run when the processor is powered up. The way the processor locates the bootloader is very device specific, but in most cases there is only one such location, and so there can only be one bootloader. If there is no backup, updating the bootloader is risky: what happens if the system powers down midway? Consequently, most update solutions leave the bootloader alone. This is not a big problem, because the bootloader only runs for a short time at power-on and is not normally a great source of run-time bugs.
Kernel
The Linux kernel is a critical component that will certainly need updating from time to time. There are several parts to the kernel:
A binary image loaded by the bootloader, often stored in the root filesystem.
Many devices also have a Device Tree Binary (DTB) that describes hardware to the kernel, and so has to be updated in tandem. The DTB is usually stored alongside the kernel binary.
There may be kernel modules in the root filesystem.
The kernel and DTB may be stored in the root filesystem, so long as the bootloader has the ability to read that filesystem format, or they may be in a dedicated partition. In either case, it is possible to have redundant copies.
Root filesystem
The root filesystem contains the essential system libraries, utilities, and scripts needed to make the system work. It is very desirable to be able to replace and upgrade all of these. The mechanism depends on the implementation. Common formats for embedded root filesystems are:
Ramdisk, loaded from raw flash memory or a disk image at boot. To update it, you just need to overwrite the ramdisk image and reboot.
Read-only compressed filesystems, such as squashfs, stored in a flash partition. Since these types of filesystem do not implement a write function, the only way to update them is to write a complete filesystem image to the partition.
Normal filesystem types. For raw flash memory, JFFS2 and UBIFS formats are common, and for managed flash memory, such as eMMC and SD cards, the format is likely to be ext4 or F2FS. Since these are writable at runtime, it is possible to update them file by file.
System applications
The system applications are the main payload of the device; they implement its primary function. As such, they are likely to be updated frequently to fix bugs and to add features. They may be bundled with the root filesystem, but it is also common for them to be placed in a separate filesystem to make updating easier and to maintain separation between the system files, which are usually open source, and the application files, which are often proprietary.
Device-specific data
This is the combination of files that are modified at runtime, and includes configuration settings, logs, user-supplied data, and the like. It is not often that they need to be updated, but they do need to be preserved during an update. Such data needs to be stored in a partition of its own.
Components that need to be updated
In summary, then, an update may include new versions of the kernel, root filesystem, and system applications. The device will have other partitions that should not be disturbed by an update, as is the case with the device runtime data.
The basics of software update
Updating software seems, at first sight, to be a simple task: you just need to overwrite some files with new copies. But then your engineer's training kicks in as you begin to realize all the things that could go wrong. What if the power goes down during the update? What if a bug, not seen while testing the update, renders a percentage of the devices unbootable? What if a third party sends a fake update that enlists your device as part of a botnet? At the very least the software update mechanism must be:
Robust, so that an update does not render the device unusable
Fail-safe, so that there is a fall-back mode if all else fails
Secure, to prevent the device from being hijacked by people installing unauthorized updates
In other words, we need a system that is not susceptible to Murphy's law, which states that if something can go wrong, then it will go wrong, eventually. Some of these problems are non-trivial, however.
Making updates robust
You might think that the problem of updating Linux systems was solved a long time ago: we all have Linux desktops that we update regularly (don't we?). Also, there are vast numbers of Linux servers running in data centers that are similarly kept up to date. However, there is a difference between a server and a device. The former is operating in a protected environment. It is unlikely to suffer a sudden loss of power or network connectivity. In the unlikely event that an update does fail, it is always possible to get access to the server and use external mechanisms to repeat the install. Devices, on the other hand, are often deployed at remote sites with intermittent power and poor network connections, which makes it much more likely that an update will be interrupted. Then, consider that it may be very expensive to get access to a device to take remedial action over a failed update if, for example, the device is an environmental monitoring station at the top of a mountain or controlling the valves of an oil well at the bottom of the sea. In consequence, it is much more important for embedded devices to have a robust update mechanism that will not result in the system becoming unusable.
The key word here is atomicity. The update as a whole must be atomic: there should be no stage at which part of the system is updated but not other parts. There must be a single, uninterruptible change to the system that switches to the new version of software.
This removes the most obvious update mechanism from consideration: that of simply updating individual files, for example, by extracting an archive over parts of the filesystem. There is just no way to ensure that there will be a consistent set of files if the system is reset during the update. Even using a package manager such as apt, yum, or zypper does not help. If you look at the internals of all these package managers, you will see that they do indeed work by extracting an archive over the filesystem and running scripts to configure the package both before and after the update. Package managers are fine for the protected world of the data center, or even your desktop, but not for a device.
To achieve atomicity, the update must be installed alongside the running system, and then a switch thrown to move from the old to the new. In later sections, I will describe two different approaches to achieving atomicity. The first is to have two copies of the root filesystem and other major components. One is live, while the other can receive updates. When the update is complete, the switch is thrown so that on reboot, the bootloader selects the updated copy. This is known as symmetric image update, or A/B image update. A variant of this theme is to use a special recovery mode operating system that is responsible for updating the main operating system. The guarantee of atomicity is shared between the bootloader and the recovery operating system. This is known as asymmetric image update. It is the approach taken in Android prior to the Nougat 7.x version.
The second approach is to have two or more copies of the root filesystem in different subdirectories of the system partition, and then use chroot(8) at boot time to select one of them. Once Linux is running, the update client can install updates into the other root filesystem, and then when everything is complete and checked, it can throw the switch and reboot. This is known as atomic file update, and is exemplified by OSTree.
Making updates fail-safe
The next problem to consider is that of recovering from an update that was installed correctly, but which contains code that stops the system from booting. Ideally, we want the system to detect this case and to revert to a previous working image.
There are several failure modes that can lead to a non-operational system. The first is a kernel panic, caused for example by a bug in a kernel device driver, or being unable to run the init program. A sensible place to start is by configuring the kernel to reboot a number of seconds after a panic. You can do this either when you build the kernel by setting CONFIG_PANIC_TIMEOUT or by adding the panic parameter to the kernel command line. For example, to reboot 5 seconds after a panic, you would add panic=5 to the kernel command line.
You may want to go further and configure the kernel to panic on an Oops. Remember that an Oops is generated when the kernel encounters a fatal error. In some cases, it will be able to recover from the error, in other cases not, but in all cases, something has gone wrong and the system is not working as it should. To enable panic on Oops in the kernel configuration, set CONFIG_PANIC_ON_OOPS=y or, on the kernel command line, oops=panic.
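It is easy to forget one of these parameters when boot scripts are edited, so a start-up self-check can be useful. The following Python sketch inspects a kernel command line string for the two parameters just described. The function name is hypothetical, and the sketch looks only at the command line: a timeout set at build time through CONFIG_PANIC_TIMEOUT is not visible to it.

```python
def panic_settings(cmdline):
    """Extract panic=<seconds> and oops=panic from a kernel command line.

    panic_timeout is None when the parameter is absent, in which case the
    value built into the kernel applies.
    """
    settings = {"panic_timeout": None, "panic_on_oops": False}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "panic" and value.lstrip("-").isdigit():
            settings["panic_timeout"] = int(value)
        elif key == "oops" and value == "panic":
            settings["panic_on_oops"] = True
    return settings


# An illustrative command line; on a target you could read /proc/cmdline.
example = "console=ttyO0,115200 root=/dev/mmcblk0p2 panic=5 oops=panic"
print(panic_settings(example))
```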
A second failure mode occurs when the kernel launches init successfully but for some reason the main application fails to run. For this, you need a watchdog. A watchdog is a hardware or software timer that restarts the system if the timer is not reset before it expires. If you are using systemd, you can use the inbuilt watchdog function, which I'll describe in Chapter 10, Starting Up – The init Program. If not, you may want to enable the watchdog support built into Linux, as described in the kernel source code in Documentation/watchdog.
Both failures result in boot loops: either a kernel panic or a watchdog timeout causes the system to reboot. If the problem is persistent, the system will reboot continually. To break out of the boot loop, we need some code in the bootloader to detect the case and to revert to the previous, known good, version. A typical approach is to use a boot count that is incremented by the bootloader on each boot, and which is reset to zero in user space once the system is up and running. If the system enters a boot loop, the counter is not reset and so continues to increase. Then, the bootloader is configured to take remedial action if the counter exceeds a threshold.
In U-Boot, this is handled by three variables:
bootcount: This variable is incremented each time the processor boots
bootlimit: If the bootcount exceeds the bootlimit, U-Boot runs the commands in altbootcmd instead of bootcmd
altbootcmd: This contains the alternative boot commands, for example to roll back to a previous version of software or to start the recovery-mode operating system
For this to work, there must be a way for a user space program to reset the boot count. We can do that using U-Boot utilities that allow the U-Boot environment to be accessed at runtime:
fw_printenv: Prints the value of a U-Boot variable
fw_setenv: Sets the value of a U-Boot variable
These two commands need to know where the U-Boot environment block is stored, for which there is a configuration file in /etc/fw_env.config. For example, if the U-Boot environment is stored at offset 0x800000 from the start of the eMMC memory, with a backup copy at 0x1000000, then the configuration would look like this:
# cat /etc/fw_env.config
/dev/mmcblk0 0x800000 0x40000
/dev/mmcblk0 0x1000000 0x40000
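Each line of that file names a device, the offset of the environment block, and its size. To show how the three fields relate, here is a short Python sketch that parses the two lines above. The EnvLocation type and the function name are invented for this example; the sketch reads only the first three fields and ignores any further optional fields a real file may carry.

```python
from typing import NamedTuple


class EnvLocation(NamedTuple):
    device: str
    offset: int
    size: int


def parse_fw_env_config(text):
    """Parse fw_env.config-style lines into EnvLocation tuples.

    Blank lines and anything after a '#' are ignored; numbers may be
    decimal or 0x-prefixed hexadecimal.
    """
    locations = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        locations.append(EnvLocation(fields[0], int(fields[1], 0), int(fields[2], 0)))
    return locations


CONFIG = """\
/dev/mmcblk0 0x800000 0x40000
/dev/mmcblk0 0x1000000 0x40000
"""

for location in parse_fw_env_config(CONFIG):
    print(location.device, location.offset, location.size)
```

In decimal, the primary copy starts 8 MiB into the device, the backup copy at 16 MiB, and each is 256 KiB long.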
There is one final thing to cover in this section. Incrementing the boot count on each boot and then resetting it when the application begins to run leads to unnecessary writes to the environment block, wearing out the flash memory and slowing down system initialization. To prevent having to do this on all reboots, U-Boot has a further variable named upgrade_available. If upgrade_available is 0, then bootcount is not incremented. upgrade_available is set to 1 after an update has been installed so that the boot count protection is in use only when it is needed.
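The interplay between these variables is easier to follow as a small model. The Python sketch below is a simplified simulation of the behavior described in this section, not U-Boot code: the environment is a plain dictionary, the function names are invented, and it assumes that user space clears upgrade_available as well as bootcount once the new software is known to be good.

```python
def bootloader_step(env):
    """Model one pass through the bootloader.

    Returns the name of the variable whose commands would be run.
    """
    if env.get("upgrade_available", 0) == 1:
        # The boot count is only touched while an update is on trial.
        env["bootcount"] = env.get("bootcount", 0) + 1
    if env.get("bootcount", 0) > env["bootlimit"]:
        return "altbootcmd"
    return "bootcmd"


def userspace_confirm(env):
    """Model what user space does once the application is up and running.

    Assumption: the update is marked as accepted by clearing both values.
    """
    env["bootcount"] = 0
    env["upgrade_available"] = 0


# A freshly installed update that never reaches user space: a boot loop.
env = {"bootcount": 0, "bootlimit": 3, "upgrade_available": 1}
print([bootloader_step(env) for _ in range(4)])

# A good update: user space confirms it after the first boot.
env = {"bootcount": 0, "bootlimit": 3, "upgrade_available": 1}
bootloader_step(env)
userspace_confirm(env)
print([bootloader_step(env) for _ in range(4)], env["bootcount"])
```

In the first case, the fourth boot is the first one on which the count exceeds the limit, so that is when the alternative boot commands run. In the second case, the count stays at zero and no further writes to the environment are needed.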
Making updates secure
The final problem relates to the potential misuse of the update mechanism itself. Your prime intention when implementing an update mechanism is to provide a reliable, automated or semi-automated method to install security patches and new features. However, others may use the same mechanism to install unauthorized versions of software and so hijack the device. We need to look at how we can ensure that this cannot happen.
The biggest vulnerability is that of a fake remote update. To prevent this, we need to authenticate the update server before starting the download. We also need a secure transfer channel, such as HTTPS, to guard against tampering with the download stream. I will return to this when describing OTA updates later on.
There is also the question of the authenticity of updates supplied locally. One way to detect a bogus update is to use a secure boot protocol in the bootloader. If the kernel image is signed at the factory with a digital key, the bootloader can check the key before it loads the kernel and refuse to load it if the keys do not match. So long as the keys are kept private by the manufacturer, it will not be possible to load a kernel that is not authorized. U-Boot implements such a mechanism, which is described in the U-Boot source code in doc/uImage.FIT/verified-boot.txt.
Secure boot: good or bad?
If I have purchased a device that has a software update feature, then I am trusting the vendor of that device to deliver useful updates. I definitely do not want a malicious third party to install software without my knowledge. But should I be allowed to install software myself? If I own the device outright, should I not be entitled to modify it, including loading new software? Recall the TiVo set-top box, which ultimately led to the creation of the GPL v3 license. Remember also the Linksys WRT54G Wi-Fi router: when access to the hardware became easy, it spawned a whole new industry, including the OpenWrt project. See, for example, http://www.wi-fiplanet.com/tutorials/article.php/3562391 for more details. This is a complex issue that sits at the crossroads between freedom and control. It is my opinion that some device manufacturers use security as an excuse to protect their, sometimes shoddy, software.
Types of update mechanism
In this section, I will describe three approaches to applying software updates: symmetric, or A/B, image update; asymmetric image update, also known as recovery mode update; and finally, atomic file update.
Symmetric image update
In this scheme, there are two copies of the operating system, each comprising the Linux kernel, root filesystem, and system applications. They are labelled as A and B in the following diagram:
The bootloader has a flag that indicates which it should load. Initially, the flag is set to A, so the bootloader loads OS image A. To install an update, the updater application, which is part of the operating system, overwrites OS image B. When complete, it changes the Boot flag to B and reboots. Now the bootloader will load the new operating system. When a further update is installed, the updater overwrites image A and changes the Boot flag to A, and so you ping-pong between the two copies. If an update fails before the Boot flag is changed, the bootloader continues to load the good operating system.
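The ping-pong behavior can be captured in a few lines. What follows is a toy model in Python, written only to make the sequence of events explicit; the class and method names are invented, images are represented by version strings, and a failed write is simulated with a flag rather than by a real power cut.

```python
class SymmetricUpdater:
    """Toy model of A/B update: two image slots and a boot flag."""

    def __init__(self, initial_version="v1"):
        self.images = {"A": initial_version, "B": None}
        self.boot_flag = "A"

    def inactive_slot(self):
        return "B" if self.boot_flag == "A" else "A"

    def install(self, version, write_ok=True):
        """Write the update to the inactive slot, then flip the boot flag.

        If the write fails, the flag is left alone, so the next boot still
        loads the known good image.
        """
        slot = self.inactive_slot()
        self.images[slot] = None  # the old contents are being overwritten
        if not write_ok:
            return False
        self.images[slot] = version
        self.boot_flag = slot  # the single switch that makes the update live
        return True

    def boot(self):
        """Return the version that the bootloader would load."""
        return self.images[self.boot_flag]


updater = SymmetricUpdater()
updater.install("v2")
print(updater.boot_flag, updater.boot())
updater.install("v3", write_ok=False)  # interrupted update
print(updater.boot_flag, updater.boot())
updater.install("v3")
print(updater.boot_flag, updater.boot())
```

The important detail is the ordering inside install(): the boot flag changes only after the new image has been written completely, so the interrupted update leaves the device booting v2 from slot B.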
There are several open source projects that implement symmetric image update. One is the Mender client operating in standalone mode, which I will describe later on in this chapter. Another is SWUpdate (https://github.com/sbabic/swupdate). SWUpdate can receive multiple image updates in a CPIO format package and then deploy those updates to different parts of the system. It allows you to write plugins in the Lua language to do custom processing. It has filesystem support for raw flash memory that is accessed as MTD flash partitions, for storage organized into UBI volumes, and for SD/eMMC storage with a disk partition table. A third example is RAUC, the Robust Auto-Update Controller (https://github.com/rauc/rauc). It too has support for raw flash storage, UBI volumes, and SD/eMMC devices. The images can be signed and verified using OpenSSL keys.
There are some drawbacks with this scheme. One is that by updating an entire filesystem image, the size of the update package is large, which can put a strain on the network infrastructure connecting the devices. This can be mitigated by sending only the filesystem blocks that have changed by performing a binary diff of the new filesystem with the previous version, although none of the previously mentioned projects implement this at the time of writing.
A second drawback is the need to keep storage space for a redundant copy of the root filesystem and other components. If the root filesystem is the largest component, it comes close to doubling the amount of flash memory you need to fit. It is for this reason that the asymmetric update scheme is used, which I describe next.
Asymmetric image update
You can reduce storage requirements by keeping a minimal recovery operating system purely for updating the main one, as shown here:
When you want to install an update, set the Boot flag to point to the Recovery OS and reboot. Once the Recovery OS is running, it can stream updates to the main operating system image. If the update is interrupted, the Bootloader will again boot into the Recovery OS, which can resume the update. Only when the update is complete and verified will the Recovery OS clear the Boot flag and reboot again, this time loading the new main operating system. Fallback in the case of a correct but buggy update is to drop the system back into recovery mode, which can attempt remedial actions, possibly by requesting an earlier update version.
The Recovery OS is usually a lot smaller than the main operating system, maybe only a few megabytes, and so the storage overhead is not great. As a matter of interest, this is the scheme that was adopted by Android prior to the Nougat release. For open source implementations of asymmetric image update, you could consider SWUpdate or RAUC, both of which I mentioned in the previous section.
The major drawback of this scheme is that while the Recovery OS is running, the device is not operational.
AtomicfileupdatesAnotherapproachistohaveredundantcopiesofarootfilesystempresentinmultipledirectoriesofasinglefilesystemandthenusethechroot(8)commandtochooseoneofthematboottime.Thisallowsonedirectorytreetobeupdatedwhileanotherismountedastherootdirectory.Furthermore,ratherthanmakingcopiesoffilesthathavenotchangedbetweenversionsoftherootfilesystem,youcoulduselinks.Thatwouldsavealotofdiskspacew2andreducetheamountofdatatobedownloadedinanupdatepackage.Thesearethebasicideasbehindatomicfileupdate.
Thechrootcommandrunsaprograminanexistingdirectory.Theprogramseesthisdirectoryasitsrootdirectory,andsocannotaccessanyfilesordirectoriesatahigherlevel.Itisoftenusedtorunaprograminaconstrainedenvironment,whichissometimesrefereedtoachrootjail.
TheOSTreeproject(https://ostree.readthedocs.org/en/latest/),nowrenamedlibOSTree,isthemostpopularimplementationofthisidea.OSTreestartedaround2011asameansofdeployingupdatestotheGNOMEdesktopdevelopers,andtoimprovetheircontinuousintegrationtesting(https://wiki.gnome.org/Projects/GnomeContinuous).Ithassincebeenadoptedasanupdatesolutionforembeddeddevices.ItisoneoftheupdatemethodsavailableinAutomotiveGradeLinux(AGL),anditisavailableintheYoctoProjectthroughthemeta-updatelayer,whichissupportedbyAdvancedTelematicSystems(ATS).
With OSTree, the files are stored on the target in the directory /ostree/repo/objects. They are given names such that several versions of the same file can exist in the repository. Then, a given set of files is linked into a deployment directory, which has a name such as /ostree/deploy/os/29ff9…/. This is referred to as checking out, since it has some similarities to the way a branch is checked out of a Git repository. Each deploy directory contains the files that make up a root filesystem. There can be any number of them, but by default there are only two. For example, here are two deploy directories, each with links back into the repo directory.
/ostree/repo/objects/...
/ostree/deploy/os/a3c83.../
    /usr/bin/bash
    /usr/bin/echo
/ostree/deploy/os/29ff9.../
    /usr/bin/bash
    /usr/bin/echo
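The space saving relies on the fact that a hard link is just a second name for the same inode, so a file that appears in two deploy directories is stored only once. The following is a minimal sketch of that idea and is not OSTree code: the function names same_file and link_into_deployment are hypothetical, and it assumes the links are hard links on a single filesystem.

```c
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Return 1 if both paths name the same file (same device and inode),
 * 0 if they are different files, and -1 if either cannot be examined.
 */
int same_file(const char *path_a, const char *path_b)
{
    struct stat a, b;

    assert(path_a != NULL && path_b != NULL);
    if (stat(path_a, &a) < 0 || stat(path_b, &b) < 0)
        return -1;
    return (a.st_dev == b.st_dev && a.st_ino == b.st_ino) ? 1 : 0;
}

/*
 * Make an object from the repository visible inside a deploy directory
 * by hard-linking it, then confirm that no copy was made.
 * Returns 0 on success and -1 on failure.
 */
int link_into_deployment(const char *object, const char *dest)
{
    assert(object != NULL && dest != NULL);
    if (link(object, dest) < 0)
        return -1;
    return (same_file(object, dest) == 1) ? 0 : -1;
}
```

Calling link_into_deployment() once per file for each deploy directory would give every deployment a complete tree while the data blocks exist only once, in the repository.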
The bootloader boots the kernel with an initramfs, passing on the kernel command line the path of the deployment to use:
bootargs=ostree=/ostree/deploy/os/deploy/29ff9...
The initramfs contains an init program, ostree-init, which reads the command line and executes the chroot to the path given.
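To make the first half of that step concrete, here is a sketch of how an init program might pick the ostree= argument out of the kernel command line (which it could read from /proc/cmdline). This is an illustration only, not the code of ostree-init; the function name find_ostree_arg is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of the "ostree=" argument from a kernel command line
 * into out, which holds out_size bytes including the terminator.
 * Returns 0 on success, -1 if the argument is absent or does not fit.
 */
int find_ostree_arg(const char *cmdline, char *out, size_t out_size)
{
    static const char key[] = "ostree=";
    const size_t key_len = sizeof(key) - 1;
    const char *p = cmdline;

    assert(cmdline != NULL && out != NULL);
    while (*p != '\0') {
        size_t len;

        /* Skip separators, then measure the next argument */
        p += strspn(p, " \t\n");
        len = strcspn(p, " \t\n");
        if (len > key_len && strncmp(p, key, key_len) == 0) {
            size_t val_len = len - key_len;

            if (val_len >= out_size)
                return -1;
            memcpy(out, p + key_len, val_len);
            out[val_len] = '\0';
            return 0;
        }
        p += len;
    }
    return -1;
}
```

With the path in hand, the program would go on to call chroot(2) on that directory and execute the real init from there.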
When a system update is installed, the files that have changed are downloaded into the repo directory by the OSTree install agent. When complete, a new deploy directory is created, with links to the collection of files that will make up the new root filesystem. Some of these will be new files, some will be the same as before.
Finally, the install agent changes the bootloader's boot flag so that on the next reboot the system will chroot to the new deploy directory. The bootloader implements the check on boot count and falls back to the previous root if a boot loop is detected.
OTA updates
Updating OTA means having the ability to push software to a device or group of devices via a network, usually without any end user interaction with the device. For this to happen we need a central server to control the update process and a protocol for downloading the update to the update client. In a typical implementation, the client polls the update server from time to time to check if there are any updates pending. The polling interval needs to be long enough that the poll traffic does not take a significant portion of the network bandwidth, but short enough that the updates can be delivered in a timely fashion. An interval of tens of minutes to several hours is often a good compromise. The poll messages from the device contain some sort of unique identifier, such as a serial number or MAC address, and the current software version. From this the update server can see if an update is needed. The poll messages may also contain other status information, such as uptime, environmental parameters, or anything that would be useful for central management of the devices.
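The exchange described above can be sketched in a few lines. The message layout below is invented purely for illustration and is not the protocol used by Mender, hawkBit, or any other real server; the structure and function names are likewise hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields a device might report when it polls; the layout is illustrative */
struct poll_info {
    const char *device_id;   /* for example, a serial number or MAC address */
    const char *sw_version;  /* the software version currently running */
    long uptime_s;           /* optional status information */
};

/* Format the poll message; returns 0 on success, -1 if buf is too small */
int format_poll_message(char *buf, size_t size, const struct poll_info *info)
{
    int n;

    assert(buf != NULL && info != NULL);
    n = snprintf(buf, size, "id=%s&version=%s&uptime=%ld",
                 info->device_id, info->sw_version, info->uptime_s);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Server-side decision: an update is needed if the versions differ */
int update_needed(const char *reported_version, const char *assigned_version)
{
    assert(reported_version != NULL && assigned_version != NULL);
    return strcmp(reported_version, assigned_version) != 0;
}
```

A client would send the formatted message once per polling interval, and the server would answer with the location of an update whenever update_needed() returns non-zero for that device.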
The update server is usually linked to a management system that will assign new versions of software to the various populations of devices under its control. If the device population is large, it may send updates in batches to avoid overloading the network. There will be some sort of status display where the current state of the devices can be shown and problems highlighted.
Of course, the update mechanism must be secure so that fake updates cannot be sent to the end devices. This involves the client and server being able to authenticate each other by an exchange of certificates. Then the client can validate that the downloaded packages are signed by the expected key.
Here are two examples of open source projects that you can use for OTA update:
Mender in managed mode
hawkBit (https://projects.eclipse.org/proposals/hawkbit) in conjunction with an updater client such as SWUpdate or RAUC
Using Mender for local updates
So much for the theory. In the last two sections of this chapter I want to demonstrate the principles I have talked about so far with examples of software update working in practice. As the basis of the example I will be using Mender. Mender uses a symmetric A/B image update mechanism, with fall-back in the case of a failed update. It can operate in a standalone mode for local updates, or in managed mode for OTA updates. I will begin with the standalone mode.
Mender is written and supported by mender.io (https://mender.io). There is much more information about the software in the documentation section of the website. I will not delve deeply into the configuration of the software here, since my aim is to illustrate the principles of software update.
Building the Mender client
The Mender client is available as a Yocto Project meta layer. These examples use the Morty release of the Yocto Project, which is the same one that we used in Chapter 6, Selecting a Build System. We begin by getting the meta-mender layer, and also oe-meta-go because the Mender client is written in the Go language:
$ cd poky
$ git clone git://github.com/mem/oe-meta-go
$ git clone -b morty git://github.com/mendersoftware/meta-mender
The Mender client requires some changes to the configuration of U-Boot to handle the boot flag and boot count variables. The stock Mender client layer has sub-layers for three sample implementations of this U-Boot integration that we can use straight out of the box: meta-mender-beaglebone, meta-mender-qemu, and meta-mender-raspberrypi. We will be using QEMU. The next step, therefore, is to create a build directory and add the layers for this configuration:
$ source oe-init-build-env build-mender-qemu
$ bitbake-layers add-layer ../oe-meta-go
$ bitbake-layers add-layer ../meta-mender/meta-mender-core
$ bitbake-layers add-layer ../meta-mender/meta-mender-demo
$ bitbake-layers add-layer ../meta-mender/meta-mender-qemu
We need to set up the environment by adding some settings to conf/local.conf (the numbers at the start of each line are for reference only and are not part of the file):
1 MENDER_ARTIFACT_NAME = "release-1"
2 INHERIT += "mender-full"
3 MACHINE = "vexpress-qemu"
4 DISTRO_FEATURES_append = " systemd"
5 VIRTUAL-RUNTIME_init_manager = "systemd"
6 DISTRO_FEATURES_BACKFILL_CONSIDERED = "sysvinit"
7 VIRTUAL-RUNTIME_initscripts = ""
8 IMAGE_FSTYPES = "ext4"
Line 2 includes a BitBake class, named mender-full, which is responsible for the special processing of the image required to create the A/B image format. Line 3 selects a machine named vexpress-qemu, which uses QEMU to emulate an ARM Versatile Express board, rather than the Versatile PB that is the default in the Yocto Project. Lines 4 to 7 select systemd as the init daemon in place of the default System V init. I describe init daemons in more detail in Chapter 10, Starting Up – The init Program. Line 8 causes the root filesystem images to be generated in ext4 format.
Now we can build an image:
$ bitbake core-image-full-cmdline
As usual, the results of the build are in tmp/deploy/images/vexpress-qemu. You will notice some new things in here compared to the Yocto Project builds we have done previously. There is a file named core-image-full-cmdline-vexpress-qemu-[timestamp].mender, and another similarly named file that ends with sdimg. The mender file will be required when we perform an update later on: I will talk more about it then. The sdimg file is created using a tool from the Yocto Project known as wic. The output is an image that contains a partition table and which is ready to be copied directly to an SD card or eMMC chip.
We can run the QEMU target using the script provided by the Mender layer, which will first boot U-Boot and then load the Linux kernel:
$ ../meta-mender/meta-mender-qemu/scripts/mender-qemu
[...]
[  OK  ] Started Mender OTA update service.
[  OK  ] Reached target Multi-User System.
Poky (Yocto Project Reference Distro) 2.2.1 vexpress-qemu ttyAMA0
vexpress-qemu login:
Log on as root, no password. Looking at the layout of the partitions on the target, we can see this:
# fdisk -l /dev/mmcblk0
Disk /dev/mmcblk0: 384 MiB, 402653184 bytes, 786432 sectors
[...]
Device         Boot  Start    End Sectors  Size Id Type
/dev/mmcblk0p1 *     49152  81919   32768   16M  c W95 FAT32 (LBA)
/dev/mmcblk0p2       81920 294911  212992  104M 83 Linux
/dev/mmcblk0p3      294912 507903  212992  104M 83 Linux
/dev/mmcblk0p4      524287 786431  262145  128M  f W95 Ext'd (LBA)
/dev/mmcblk0p5      524288 786431  262144  128M 83 Linux
There are five partitions in all:
Partition 1: This contains the U-Boot boot files
Partitions 2 and 3: These contain the A/B root filesystems: at this stage, they are identical
Partition 4: This is just an extended partition that contains the remaining partitions
Partition 5: This is a writable partition that stores the device-specific data that must not be overwritten during an image update
Running the mount command shows that the second partition is being used as the root filesystem, leaving the third to receive updates:
# mount
/dev/mmcblk0p2 on / type ext4 (rw,relatime,data=ordered)
[...]
Installing an update
Now we want to make a change to the root filesystem and then install it as an update.
We will begin by taking a copy of the image we just built. This will be the live image that we are going to update. If we don't do this, the QEMU script will just load the latest image generated by BitBake, including updates, which defeats the object of the demonstration:
$ cd tmp/deploy/images/vexpress-qemu
$ cp core-image-full-cmdline-vexpress-qemu.sdimg \
core-image-live-vexpress-qemu.sdimg
$ cd -
We are just going to change the hostname of the target, which will be easy to see when it is installed. To do this, edit conf/local.conf and add this line:
hostname_pn-base-files = "vexpress-qemu-release2"
We can build it in the same way as before:
$ bitbake core-image-full-cmdline
This time we are not interested in the sdimg file, which contains a complete new image. Instead we want to take only the new root filesystem, which is in core-image-full-cmdline-vexpress-qemu.mender. The mender file is in a format that is recognized by the Mender client. The mender file format consists of version information, a header, and the root filesystem image put together in a compressed .tar archive.
The next step is to deploy the new artifact to the target, initiating the update locally on the device, but receiving the update from a server. So, boot QEMU using the original image:
$ ../meta-mender/meta-mender-qemu/scripts/mender-qemu \
core-image-live
Check that the network is configured, with QEMU at 10.0.2.15, and the host at 10.0.2.2:
# ping -c 1 10.0.2.2
PING 10.0.2.2 (10.0.2.2) 56(84) bytes of data.
64 bytes from 10.0.2.2: icmp_seq=1 ttl=255 time=0.326 ms
--- 10.0.2.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.326/0.326/0.326/0.000 ms
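Now, in another Terminal session, start a web server on the host that can serve up the update. The command shown uses the Python 2 module; with Python 3, the equivalent is python3 -m http.server:
$ cd tmp/deploy/images/vexpress-qemu
$ python -m SimpleHTTPServer
Serving HTTP on 0.0.0.0 port 8000 ...
It is listening on port 8000. When you are done with the web server, type Ctrl-C to terminate it.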
Back on the target, issue this command to get the update:
# mender -log-level info -rootfs \
http://10.0.2.2:8000/core-image-full-cmdline-vexpress-qemu.mender
[...]
INFO[0104] wrote 109051904/109051904 bytes of update to device
/dev/mmcblk0p3 module=device 100% 37132 KiB
INFO[0106] Enabling partition with new image installed to be a
boot candidate: 3 module=device
The update was written to the third partition, /dev/mmcblk0p3, while our root filesystem is still on partition 2, mmcblk0p2. Note that the 109051904 bytes written are exactly 212992 sectors of 512 bytes, the full size of the partition.
Reboot QEMU. Note that now the root filesystem is mounted on partition 3, and that the hostname has changed:
# mount
/dev/mmcblk0p3 on / type ext4 (rw,relatime,data=ordered)
[...]
# hostname
vexpress-qemu-release2
Success!
There is one more thing to do. We need to consider the issue of boot loops. Using fw_printenv to look at the U-Boot variables, we see:
# fw_printenv upgrade_available
upgrade_available=1
# fw_printenv bootcount
bootcount=1
# fw_printenv bootlimit
bootlimit=1
If the system reboots without clearing the bootcount, U-Boot should detect it and fall back to the previous installation. Let's try it out by rebooting the target right away.
When the target comes up again, we see that this has indeed happened:
# mount
/dev/mmcblk0p2 on / type ext4 (rw,relatime,data=ordered)
[...]
# hostname
vexpress-qemu
Now, let's repeat the update procedure, but this time, after the reboot, commit the change:
# mender -commit
[...]
# fw_printenv upgrade_available
upgrade_available=0
# fw_printenv bootcount
bootcount=1
# fw_printenv bootlimit
bootlimit=1
Once upgrade_available is cleared, U-Boot will no longer check bootcount, and so the device will continue to mount this updated root filesystem. When a further update is loaded, the Mender client will clear bootcount and set upgrade_available once again.
This example uses the Mender client from the command line to initiate an update locally. The update itself came from a server, but could just as easily have been provided on a USB flash drive or an SD card. In place of Mender, we could have used the other image update clients mentioned: SWUpdate or RAUC. They each have their advantages, but the basic technique is the same.
Using Mender for OTA updates
The next stage is to see how OTA updates work in practice. Once again we will be using the Mender client on the device, but this time operating it in managed mode, and in addition we will be configuring a server to deploy the update, so that no local interaction is needed. Mender provides an open source server for this.
The installation requires Docker Engine version 17.03 or later to be installed. Refer to the Docker website at https://docs.docker.com/engine/installation. It also requires Docker Compose version 1.6, as described here: https://docs.docker.com/compose/install/. Once they are installed, you can install the Mender integration environment:
$ curl -L \
https://github.com/mendersoftware/integration/archive/1.0.1.tar.gz | tar xz
$ cd integration-1.0.1
$ ./up
Once you run the up script you will see that it downloads several hundred megabytes of Docker images, which may take some time, depending on your internet connection speed. After a while, you will see that it boots a copy of the Mender client running in a QEMU emulation.
This means that the server is up and running. Do not be misled by the rather verbose message logging, which makes it appear to be busy downloading still when it has in fact finished.
The Mender web interface is now running on https://localhost/. Copy that URL into a web browser and accept the certificate warning that pops up. This is because the web service is using a self-signed certificate that the browser will not recognize. Then create a test user account.
On the Dashboard, you will see that there is one device waiting for authorization. This is the QEMU client that is started by the ./up script, not the one we created. Click on the green arrow to authorize it.
We need to make a change to the configuration of the target so that it will poll our local server for updates. For this demonstration, the server URL is going to be s3.docker.mender.io, which we map to the address localhost by appending a line to the hosts file. The way to do this with the Yocto Project is to create a layer with a file that appends to the recipe that creates the hosts file, which is meta/recipes-core/netbase/netbase_5.3.bb. There is a suitable layer in MELP/chapter_08/meta-ota:
$ cd poky
$ cp -a MELP/chapter_08/meta-ota .
$ source oe-init-build-env build-mender-qemu
$ bitbake-layers add-layer ../meta-ota
Build the new image using the following command:
$ bitbake core-image-full-cmdline
Then take a copy. This will become our live image for this section:
$ cd tmp/deploy/images/vexpress-qemu
$ cp core-image-full-cmdline-vexpress-qemu.sdimg \
core-image-live-ota-vexpress-qemu.sdimg
$ cd -
Boot up the QEMU build that we created earlier in this section:
$ ../meta-mender/meta-mender-qemu/scripts/mender-qemu \
core-image-live-ota
After a few seconds, you will see a new device appear in the dashboard of the web interface. This happens so quickly because, for the purposes of demonstrating the system, the Mender client has been configured to poll the server every 5 seconds. A much longer polling interval would be used in production: 30 minutes is suggested. You can see how this is configured by looking at the file /etc/mender/mender.conf on the target:
{
  "ClientProtocol": "http",
  "HttpsClient": {
    "Certificate": "",
    "Key": ""
  },
  "RootfsPartA": "/dev/mmcblk0p2",
  "RootfsPartB": "/dev/mmcblk0p3",
  "UpdatePollIntervalSeconds": 5,
  "InventoryPollIntervalSeconds": 5,
  "RetryPollIntervalSeconds": 1,
  "ServerURL": "https://docker.mender.io",
  "ServerCertificate": "/etc/mender/server.crt"
}
Also in there you can see the server URL, and that the server certificate has been set to the default, server.crt.
In the web user interface, click the green icon to authorize the new device, and then click on the entry for the device to see the details.
Now, we can once again create an update and deploy it, this time OTA. Begin by changing this line in conf/local.conf:
MENDER_ARTIFACT_NAME = "OTA-update1"
Build it once again, producing a new core-image-full-cmdline-vexpress-qemu.mender in tmp/deploy/images. Then import this into the web interface by clicking on the Artifacts tab and navigating to the directory containing the update. It should copy it into the server data store, and it should appear as a new artifact with the name OTA-update1.
To deploy the update to our QEMU device, click on the Devices tab and select the device. Click on the Create a deployment for this device option at the bottom right of the device information, and select the OTA-update1 artifact. It should become a Pending deployment, and then an In progress deployment. After a while the QEMU client should have written the update to the spare filesystem image, and then it will reboot and commit the update. The web UI should now report it as a Past deployment, and the client is now running OTA-update1.
After a few experiments with the server you may want to clear the state and start all over again. You can do that with these three commands, entered in the directory integration-1.0.1/:
./stop
./reset
./up
Summary
Being able to update the software on devices in the field is at the very least a useful attribute, and if the device is connected to the internet, it becomes an absolute must. And yet, all too often it is a feature that is left until the last part of a project, on the assumption that it is not a hard problem to solve. In this chapter, I hope that I have illustrated the problems that are associated with designing an effective and robust update mechanism, and also that there are several open source options readily available. You do not have to reinvent the wheel anymore.
The approach used most often, and also the one with the most real-world testing, is the symmetric image (A/B) update, or its cousin the asymmetric (recovery) image update. Here, you have the choice of SWUpdate, RAUC, and Mender. A more recent innovation is the atomic file update, in the form of OSTree. This has good characteristics in reducing the amount of data that needs to be downloaded and the amount of redundant storage that needs to be fitted on the target.
It is quite common to deploy updates on a small scale by visiting each site and applying the update from a USB memory stick or SD card. But, if you want to deploy to remote locations, or deploy at scale, an Over The Air update option will be needed.
The next chapter describes how you control the hardware components of your system through the use of device drivers, both in the conventional sense of drivers that are part of the kernel, and also the extent to which you can control hardware from user space.
Interfacing with Device Drivers
Kernel device drivers are the mechanism through which the underlying hardware is exposed to the rest of the system. As a developer of embedded systems, you need to know how these device drivers fit into the overall architecture and how to access them from user space programs. Your system will probably have some novel pieces of hardware, and you will have to work out a way of accessing them. In many cases, you will find that there are device drivers provided for you, and you can achieve everything you want without writing any kernel code. For example, you can manipulate GPIO pins and LEDs using files in sysfs, and there are libraries to access serial buses, including SPI (Serial Peripheral Interface) and I2C (Inter-Integrated Circuit).
There are many places to find out how to write a device driver, but few to tell you why you would want to and the choices you have in doing so. This is what I want to cover here. However, remember that this is not a book dedicated to writing kernel device drivers and that the information given here is to help you navigate the territory but not necessarily to set up home there. There are many good books and articles that will help you to write device drivers, some of which are listed at the end of this chapter.
In this chapter we will cover the following topics:
The role of device drivers
Character devices
Block devices
Network devices
Finding out about drivers at runtime
Finding the right device driver
Device drivers in user space
Writing a kernel device driver
Discovering the hardware configuration
The role of device drivers
As I mentioned in Chapter 4, Configuring and Building the Kernel, one of the functions of the kernel is to encapsulate the many hardware interfaces of a computer system and present them in a consistent manner to user space programs. The kernel has frameworks designed to make it easy to write a device driver, which is the piece of code that mediates between the kernel above and the hardware below. A device driver may be written to control physical devices such as a UART or an MMC controller, or it may represent a virtual device such as the null device (/dev/null) or a ramdisk. One driver may control multiple devices of the same kind.
Kernel device driver code runs at a high privilege level, as does the rest of the kernel. It has full access to the processor address space and hardware registers. It can handle interrupts and DMA transfers. It can make use of the sophisticated kernel infrastructure for synchronization and memory management. However, you should be aware that there is a downside to this; if something goes wrong in a buggy driver, it can go really wrong and bring the system down. Consequently, there is a principle that device drivers should be as simple as possible by just providing information to applications where the real decisions are made. You often hear this being expressed as no policy in the kernel. It is the responsibility of user space to set the policy that governs the overall behavior of the system. For example, the loading of kernel modules in response to external events, such as plugging in a new USB device, is the responsibility of the user space program, udev, not the kernel. The kernel just supplies a means of loading a kernel module.
In Linux, there are three main types of device driver:
Character: This is for unbuffered I/O with a rich range of functions and a thin layer between the application code and the driver. It is the first choice when implementing custom device drivers.
Block: This has an interface tailored for block I/O to and from mass storage devices. There is a thick layer of buffering designed to make disk reads and writes as fast as possible, which makes it unsuitable for anything else.
Network: This is similar to a block device but is used for transmitting and receiving network packets rather than disk blocks.
There is also a fourth type that presents itself as a group of files in one of the pseudo filesystems. For example, you might access the GPIO driver through a group of files in /sys/class/gpio, as I will describe later on in this chapter. Let's begin by looking in more detail at the three basic device types.
Character devices
Character devices are identified in user space by a special file called a device node. This filename is mapped to a device driver using the major and minor numbers associated with it. Broadly speaking, the major number maps the device node to a particular device driver, and the minor number tells the driver which interface is being accessed. For example, the device node of the first serial port on the ARM Versatile PB is named /dev/ttyAMA0, and it has major number 204 and minor number 64. The device node for the second serial port has the same major number, since it is handled by the same device driver, but the minor number is 65. We can see the numbers for all four serial ports from the directory listing here:
# ls -l /dev/ttyAMA*
crw-rw---- 1 root root 204, 64 Jan 1 1970 /dev/ttyAMA0
crw-rw---- 1 root root 204, 65 Jan 1 1970 /dev/ttyAMA1
crw-rw---- 1 root root 204, 66 Jan 1 1970 /dev/ttyAMA2
crw-rw---- 1 root root 204, 67 Jan 1 1970 /dev/ttyAMA3
The list of standard major and minor numbers can be found in the kernel documentation in Documentation/devices.txt. The list does not get updated very often and does not include the ttyAMA device described in the preceding paragraph. Nevertheless, if you look at the kernel source code in drivers/tty/serial/amba-pl011.c, you will see where the major and minor numbers are declared:
#define SERIAL_AMBA_MAJOR 204
#define SERIAL_AMBA_MINOR 64
Where there is more than one instance of a device, as with the ttyAMA driver, the convention for forming the name of the device node is to take a base name, ttyAMA, and append the instance number, from 0 to 3 in this example.
As I mentioned in Chapter 5, Building a Root Filesystem, the device nodes can be created in several ways:
devtmpfs: The device node is created when the device driver registers a new device interface using a base name supplied by the driver (ttyAMA) and an instance number.
udev or mdev (without devtmpfs): Essentially the same as with devtmpfs, except that a user space daemon program has to extract the device name from sysfs and create the node. I will talk about sysfs later.
mknod: If you are using static device nodes, they are created manually using mknod.
You may have the impression from the numbers I have used above that both major and minor numbers are 8-bit numbers in the range 0 to 255. In fact, from Linux 2.6 onwards, the major number is 12 bits long, which gives valid numbers from 1 to 4,095, and the minor number is 20 bits, from 0 to 1,048,575.
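The 12-bit and 20-bit split means that a device number fits in a single 32-bit value. The sketch below packs and unpacks such a value in the way the kernel does internally, with the minor number in the low 20 bits. The demo_ names are made up for this example. User space programs should not rely on this layout, since the C library encodes dev_t differently; they should use the major() and minor() macros from <sys/sysmacros.h> instead.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MINOR_BITS 20
#define DEMO_MINOR_MASK ((1u << DEMO_MINOR_BITS) - 1)

/* Pack a 12-bit major and a 20-bit minor number into one 32-bit value */
uint32_t demo_mkdev(uint32_t major, uint32_t minor)
{
    assert(major < (1u << 12) && minor <= DEMO_MINOR_MASK);
    return (major << DEMO_MINOR_BITS) | minor;
}

/* Recover the major number from the top 12 bits */
uint32_t demo_major(uint32_t dev)
{
    return dev >> DEMO_MINOR_BITS;
}

/* Recover the minor number from the bottom 20 bits */
uint32_t demo_minor(uint32_t dev)
{
    return dev & DEMO_MINOR_MASK;
}
```

For /dev/ttyAMA0, demo_mkdev(204, 64) gives 0x0CC00040, from which demo_major() and demo_minor() recover 204 and 64.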
When you open a character device node, the kernel checks to see whether the major and minor numbers fall into a range registered by a character device driver. If so, it passes the call to the driver, otherwise the open call fails. The device driver can extract the minor number to find out which hardware interface to use.
To write a program that accesses a device driver, you have to have some knowledge of how it works. In other words, a device driver is not the same as a file: the things you do with it change the state of the device. A simple example is the pseudo random number generator, urandom, which returns bytes of random data every time you read it. Here is a program that does just this (you will find the code in MELP/chapter_09/read-urandom):
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int f;
    unsigned int rnd;
    int n;

    /* Open the device node just like an ordinary file */
    f = open("/dev/urandom", O_RDONLY);
    if (f < 0) {
        perror("Failed to open urandom");
        return 1;
    }
    /* Each read returns fresh data from the driver */
    n = read(f, &rnd, sizeof(rnd));
    if (n != sizeof(rnd)) {
        perror("Problem reading urandom");
        return 1;
    }
    printf("Random number = 0x%x\n", rnd);
    close(f);
    return 0;
}
The nice thing about the Unix driver model is that once we know that there is a device named urandom and that every time we read from it, it returns a fresh set of pseudo random data, we don't need to know anything else about it. We can just use standard functions such as open(2), read(2), and close(2).
You could use the stream I/O functions, fopen(3), fread(3), and fclose(3) instead, but the buffering implicit in these functions often causes unexpected behavior. For example, fwrite(3) usually only writes to the user space buffer, not to the device. You would need to call fflush(3) to force the buffer to be written out. Therefore, it is best to not use stream I/O functions when calling device drivers.
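The effect of the user space buffer is easy to observe with an ordinary file standing in for a device. The sketch below is only a demonstration: the function names are invented, and it assumes the usual behavior of a C library in which a stream opened on a regular file is fully buffered, so that a short write stays in the buffer until it is flushed.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Return the size of the file at path in bytes, or -1 on error */
static long file_size(const char *path)
{
    struct stat st;

    return (stat(path, &st) == 0) ? (long)st.st_size : -1;
}

/*
 * Write a five-byte message with fwrite(3) and record how many bytes have
 * reached the file before and after fflush(3). Returns 0 on success.
 */
int buffered_write_demo(const char *path, long *before, long *after)
{
    static const char msg[] = "hello";
    FILE *f;

    assert(path != NULL && before != NULL && after != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fwrite(msg, 1, sizeof(msg) - 1, f);
    *before = file_size(path);  /* data is typically still in the buffer */
    fflush(f);
    *after = file_size(path);   /* data has now been passed to write(2) */
    fclose(f);
    return 0;
}
```

With a typical C library, before is reported as 0 and after as 5. Had the stream been connected to a device driver, the driver would not have seen the data until the flush, which is exactly the kind of surprise that using write(2) directly avoids.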
Block devices
Block devices are also associated with a device node, which also has major and minor numbers.
Although character and block devices are identified using major and minor numbers, they are in different namespaces. A character driver with a major number 4 is in no way related to a block driver with a major number 4.
With block devices, the major number is used to identify the device driver and the minor number is used to identify the partition. Let's look at the MMC driver on the BeagleBone Black as an example:
# ls -l /dev/mmcblk*
brw-rw---- 1 root disk 179,  0 Jan 1 2000 /dev/mmcblk0
brw-rw---- 1 root disk 179,  1 Jan 1 2000 /dev/mmcblk0p1
brw-rw---- 1 root disk 179,  2 Jan 1 2000 /dev/mmcblk0p2
brw-rw---- 1 root disk 179,  8 Jan 1 2000 /dev/mmcblk1
brw-rw---- 1 root disk 179, 16 Jan 1 2000 /dev/mmcblk1boot0
brw-rw---- 1 root disk 179, 24 Jan 1 2000 /dev/mmcblk1boot1
brw-rw---- 1 root disk 179,  9 Jan 1 2000 /dev/mmcblk1p1
brw-rw---- 1 root disk 179, 10 Jan 1 2000 /dev/mmcblk1p2
Here, mmcblk0 is the microSD card slot, which has a card that has two partitions, and mmcblk1 is the eMMC chip that also has two partitions. The major number for the MMC block driver is 179 (you can look it up in devices.txt). The minor numbers are used in ranges to identify different physical MMC devices, and the partitions of the storage medium that are on that device. In the case of the MMC driver, the ranges are eight minor numbers per device: the minor numbers from 0 to 7 are for the first device, the numbers from 8 to 15 are for the second, and so on. Within each range, the first minor number represents the entire device as raw sectors, and the others represent up to seven partitions. On eMMC chips, there are two 128 KiB areas of memory reserved for use by a bootloader. These are represented as devices: mmcblk1boot0 and mmcblk1boot1, and they have minor numbers 16 and 24.
As another example, you are probably aware of the SCSI disk driver, known as sd, which is used to control a range of disks that use the SCSI command set, which includes SCSI, SATA, USB mass storage, and universal flash storage (UFS). It has the major number 8 and ranges of 16 minor numbers per interface (or disk). The minor numbers from 0 to 15 are for the first interface with device nodes named sda up to sda15, the numbers from 16 to 31 are for the second disk with device nodes sdb up to sdb15, and so on. This continues up to the 16th disk from 240 to 255 with the node name sdp. There are other major numbers reserved for them because SCSI disks are so popular, but we needn't worry about that here.
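The arithmetic behind these ranges is a division and a remainder. The following sketch, whose function names are invented for illustration, splits a minor number into a range index and a partition number, and uses that to form sd node names for major number 8. Note that for MMC the range index is not always the number that appears in the node name, because the boot areas occupy ranges of their own.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Split a minor number into the index of its range and a partition number,
 * given how many minor numbers each device owns. Partition 0 means the
 * whole device.
 */
void split_minor(unsigned int minor, unsigned int per_device,
                 unsigned int *index, unsigned int *partition)
{
    assert(per_device > 0 && index != NULL && partition != NULL);
    *index = minor / per_device;
    *partition = minor % per_device;
}

/*
 * Form the sd node name for minor numbers 0 to 255 of major number 8,
 * for example 17 gives "sdb1". Returns 0 on success, -1 if buf is too small.
 */
int sd_node_name(unsigned int minor, char *buf, size_t size)
{
    unsigned int disk, part;
    int n;

    assert(minor < 256 && buf != NULL);
    split_minor(minor, 16, &disk, &part);
    if (part == 0)
        n = snprintf(buf, size, "sd%c", 'a' + (int)disk);
    else
        n = snprintf(buf, size, "sd%c%u", 'a' + (int)disk, part);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Applied to the MMC listing above with eight minor numbers per device, minor number 10 falls in range 1 as partition 2, which is mmcblk1p2.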
Both the MMC and SCSI block drivers expect to find a partition table at the start of the disk. The partition table is created using utilities such as fdisk, sfdisk, or parted.
A user space program can open and interact with a block device directly via the device node. This is not a common thing to do, though, and is usually only done to perform administrative operations such as creating partitions, formatting a partition with a filesystem, and mounting. Once the filesystem is mounted, you interact with the block device indirectly through the files in that filesystem.
Network devices
Network devices are not accessed through device nodes, and they do not have major and minor numbers. Instead, a network device is allocated a name by the kernel, based on a string and an instance number. Here is an example of the way a network driver registers an interface:
my_netdev = alloc_netdev(0, "net%d", NET_NAME_UNKNOWN, netdev_setup);
ret = register_netdev(my_netdev);
This creates a network device named net0 the first time it is called, net1 the second time, and so on. More common names are lo, eth0, and wlan0. Note that this is the name it starts off with; device managers, such as udev, may change it to something different later on.
Usually, the network interface name is only used when configuring the network using utilities, such as ip and ifconfig, to establish a network address and route. Thereafter, you interact with the network driver indirectly by opening sockets, and letting the network layer decide how to route the packets to the right interface.
However, it is possible to access network devices directly from user space by creating a socket and using the ioctl commands listed in include/linux/sockios.h. For example, this program uses SIOCGIFHWADDR to query the driver for the hardware (MAC) address (the code is in MELP/chapter_09/show-mac-addresses):
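#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>
#include <net/if.h>

int main(int argc, char *argv[])
{
    int s;
    int ret;
    struct ifreq ifr;
    int i;

    if (argc != 2) {
        printf("Usage %s [network interface]\n", argv[0]);
        return 1;
    }
    /* Any socket will do: it only serves as a handle for the ioctl */
    s = socket(PF_INET, SOCK_DGRAM, 0);
    if (s < 0) {
        perror("socket");
        return 1;
    }
    /* Copy the interface name, truncating it if it is too long */
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, argv[1], IFNAMSIZ - 1);
    ret = ioctl(s, SIOCGIFHWADDR, &ifr);
    if (ret < 0) {
        perror("ioctl");
        return 1;
    }
    /* The first six bytes of sa_data hold the MAC address */
    for (i = 0; i < 6; i++)
        printf("%02x:", (unsigned char)ifr.ifr_hwaddr.sa_data[i]);
    printf("\n");
    close(s);
    return 0;
}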
Finding out about drivers at runtime
Once you have a running Linux system, it is useful to know which device drivers are loaded and what state they are in. You can find out a lot by reading the files in /proc and /sys.
First of all, you can list the character and block device drivers currently loaded and active by reading /proc/devices:
# cat /proc/devices
Character devices:
  1 mem
  2 pty
  3 ttyp
  4 /dev/vc/0
  4 tty
  4 ttyS
  5 /dev/tty
  5 /dev/console
  5 /dev/ptmx
  7 vcs
 10 misc
 13 input
 29 fb
 81 video4linux
 89 i2c
 90 mtd
116 alsa
128 ptm
136 pts
153 spi
180 usb
189 usb_device
204 ttySC
204 ttyAMA
207 ttymxc
226 drm
239 ttyLP
240 ttyTHS
241 ttySiRF
242 ttyPS
243 ttyWMT
244 ttyAS
245 ttyO
246 ttyMSM
247 ttyAML
248 bsg
249 iio
250 watchdog
251 ptp
252 pps
253 media
254 rtc

Block devices:
259 blkext
  7 loop
  8 sd
 11 sr
 31 mtdblock
 65 sd
 66 sd
 67 sd
 68 sd
 69 sd
 70 sd
 71 sd
128 sd
129 sd
130 sd
131 sd
132 sd
133 sd
134 sd
135 sd
179 mmc
For each driver, you can see the major number and the base name. However, this does not tell you how many devices each driver is attached to. It only shows ttyAMA but gives you no clue that it is attached to four real serial ports. I will come back to that later when I look at sysfs.
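Because each entry is simply a number followed by a name, the file is easy to process from a program. Here is a sketch of a parser for one line; the function name and the 64-byte name buffer are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one entry of /proc/devices, such as "204 ttyAMA".
 * The name buffer must hold at least 64 bytes.
 * Returns 0 on success, or -1 for headings and blank lines.
 */
int parse_devices_line(const char *line, unsigned int *major, char *name)
{
    assert(line != NULL && major != NULL && name != NULL);
    return (sscanf(line, "%u %63s", major, name) == 2) ? 0 : -1;
}
```

A program would read the file line by line with fgets(3) and call this function on each line. Since the character and block sections use separate number spaces, it would also need to note which heading it has passed most recently.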
Of course, network devices do not appear in this list, because they do not have device nodes. Instead, you can use tools such as ifconfig or ip to get a list of network devices:
# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT qlen 1000
    link/ether 54:4a:16:bb:b7:03 brd ff:ff:ff:ff:ff:ff
3: usb0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether aa:fb:7f:5e:a8:d5 brd ff:ff:ff:ff:ff:ff
You can also find out about devices attached to USB or PCI buses using the well-known commands lsusb and lspci. There is information about them in the respective manual pages and plenty of online guides, so I will not describe them any further here.
The really interesting information is in sysfs, which is the next topic.
Getting information from sysfs
You can define sysfs in a pedantic way as a representation of kernel objects, attributes, and relationships. A kernel object is a directory, an attribute is a file, and a relationship is a symbolic link from one object to another. From a more practical point of view, since the Linux device driver model represents all devices and drivers as kernel objects, you can see the kernel's view of the system laid out before you by looking in /sys, as shown here:
# ls /sys
block  class  devices   fs      module
bus    dev    firmware  kernel  power
In the context of discovering information about devices and drivers, I will look at three of these directories: devices, class, and block.
The devices: /sys/devices
This is the kernel's view of the devices discovered since boot and how they are connected to each other. It is organized at the top level by the system bus, so what you see varies from one system to another. This is the QEMU emulation of the ARM Versatile:
# ls /sys/devices
platform software system tracepoint virtual
There are three directories that are present on all systems:
system/: This contains devices at the heart of the system, including CPUs and clocks.
virtual/: This contains devices that are memory-based. You will find the memory devices that appear as /dev/null, /dev/random, and /dev/zero in virtual/mem. You will find the loopback device, lo, in virtual/net.
platform/: This is a catch-all for devices that are not connected via a conventional hardware bus. This may be almost everything on an embedded device.
The other devices appear in directories that correspond to actual system buses. For example, the PCI root bus, if there is one, appears as pci0000:00.
Navigating this hierarchy is quite hard, because it requires some knowledge of the topology of your system, and the path names become quite long and hard to remember. To make life easier, /sys/class and /sys/block offer two different views of the devices.
The drivers: /sys/class
This is a view of the device drivers presented by their type. In other words, it is a software view rather than a hardware view. Each of the subdirectories represents a class of driver and is implemented by a component of the driver framework. For example, UART devices are managed by the tty layer, and you will find them in /sys/class/tty. Likewise, you will find network devices in /sys/class/net, input devices such as the keyboard, the touchscreen, and the mouse in /sys/class/input, and so on.
There is a symbolic link in each subdirectory for each instance of that type of device, pointing to its representation in /sys/devices.
Totakeaconcreteexample,let'slookattheserialportsontheVersatilePB.Firstofall,wecanseethattherearefourofthem:
#ls-d/sys/class/tty/ttyAMA*
/sys/class/tty/ttyAMA0/sys/class/tty/ttyAMA2
/sys/class/tty/ttyAMA1/sys/class/tty/ttyAMA3
Each directory is a representation of the kernel object that is associated with an instance of a device interface. Looking within one of these directories, we can see the attributes of the object, represented as files, and the relationships with other objects, represented by links:
# ls /sys/class/tty/ttyAMA0
close_delay flags line uartclk
closing_wait io_type port uevent
custom_divisor iomem_base power xmit_fifo_size
dev iomem_reg_shift subsystem
device irq type
The link called device points to the hardware object for the device. The link named subsystem points back to the parent subsystem, /sys/class/tty. The remaining directory entries are attributes. Some are specific to a serial port, such as xmit_fifo_size, and others apply to many types of device, such as the interrupt number, irq, and the device number, dev. Some attribute files are writable and allow you to tune parameters in the driver at runtime.
The dev attribute is particularly interesting. If you look at its value, you will find the following:
# cat /sys/class/tty/ttyAMA0/dev
204:64
These are the major and minor numbers of this device. This attribute is created when the driver registers this interface. It is from this file that udev and mdev find the major and minor numbers of the device.
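As a rough illustration of how a device manager might consume this attribute, here is a minimal sketch in C that parses the MAJOR:MINOR text. The function names parse_dev_attr and read_dev_attr are hypothetical helpers invented for this example; they are not taken from udev or mdev.

```c
#include <stdio.h>

/* Parse the text of a sysfs "dev" attribute, such as "204:64\n".
 * Returns 0 on success, or -1 if the text is not in MAJOR:MINOR form.
 * (Hypothetical helper, for illustration only.) */
int parse_dev_attr(const char *text, unsigned int *major, unsigned int *minor)
{
    if (sscanf(text, "%u:%u", major, minor) != 2)
        return -1;
    return 0;
}

/* Read the dev attribute at the given path and parse it.
 * Returns 0 on success, -1 on any error. */
int read_dev_attr(const char *path, unsigned int *major, unsigned int *minor)
{
    char buf[32];
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return parse_dev_attr(buf, major, minor);
}
```

On the Versatile PB example, calling read_dev_attr() with /sys/class/tty/ttyAMA0/dev would be expected to yield major 204 and minor 64, which are the numbers needed to create the device node.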
The block drivers: /sys/block
There is one more view of the device model that is important to this discussion: the block driver view that you will find in /sys/block. There is a subdirectory for each block device. This example is taken from a BeagleBone Black:
# ls /sys/block
loop0 loop4 mmcblk0 ram0 ram12 ram2 ram6
loop1 loop5 mmcblk1 ram1 ram13 ram3 ram7
loop2 loop6 mmcblk1boot0 ram10 ram14 ram4 ram8
loop3 loop7 mmcblk1boot1 ram11 ram15 ram5 ram9
If you look into mmcblk1, which is the eMMC chip on this board, you can see the attributes of the interface and the partitions within it:
# ls /sys/block/mmcblk1
alignment_offset ext_range mmcblk1p1 ro
bdi force_ro mmcblk1p2 size
capability holders power slaves
dev inflight queue stat
device mmcblk1boot0 range subsystem
discard_alignment mmcblk1boot1 removable uevent
The conclusion, then, is that you can learn a lot about the devices (the hardware) and the drivers (the software) that are present on a system by reading sysfs.
Finding the right device driver
A typical embedded board is based on a reference design from the manufacturer with changes to make it suitable for a particular application. The BSP that comes with the reference board should support all of the peripherals on that board. But then you customize the design, perhaps by adding a temperature sensor attached via I2C, some lights and buttons connected via GPIO pins, a display panel via a MIPI interface, or many other things. Your job is to create a custom kernel to control all of these, but where do you start to look for device drivers to support all of these peripherals?
The most obvious place to look is the driver support page on the manufacturer's website, or you could ask them directly. In my experience, this seldom gets the result you want; hardware manufacturers are not particularly Linux-savvy, and they often give you misleading information. They may have proprietary drivers as binary blobs or they may have source code but for a different version of the kernel than the one you have. So, by all means try this route. Personally, I will always try to find an open source driver for the task in hand.
There may be support in your kernel already: there are many thousands of drivers in mainline Linux and there are many vendor-specific drivers in the vendor kernels. Begin by running make menuconfig (or xconfig) and search for the product name or number. If you do not find an exact match, try more generic searches, allowing for the fact that most drivers handle a range of products from the same family. Next, try searching through the code in the drivers directory (grep is your friend here).
If you still don't have a driver, you can try searching online and asking in the relevant forums to see if there is a driver for a later version of Linux. If you find one, you should seriously consider updating the BSP to use the later kernel. Sometimes this is not practical, and so you may have to think about backporting the driver to your kernel. If the kernel versions are similar, it may be easy, but if they are more than 12 to 18 months apart, the chances are that the code will have changed to the extent that you will have to rewrite a chunk of the driver to integrate it with your kernel. If all of the above fail, you will have to find a solution yourself by writing the missing kernel driver. But this is not always necessary, as I will show in the next section.
Device drivers in user space
Before you start writing a device driver, pause for a moment to consider whether it is really necessary. There are generic device drivers for many common types of device that allow you to interact with hardware directly from user space without having to write a line of kernel code. User space code is certainly easier to write and debug. It is also not covered by the GPL, although I don't feel that is a good reason in itself to do it this way.
These drivers fall into two broad categories: those that you control through files in sysfs, including GPIO and LEDs, and serial buses that expose a generic interface through a device node, such as I2C.
GPIO
General-Purpose Input/Output (GPIO) is the simplest form of digital interface since it gives you direct access to individual hardware pins, each of which can be in one of two states: either high or low. In most cases you can configure the GPIO pin to be either an input or an output. You can even use a group of GPIO pins to create higher level interfaces such as I2C or SPI by manipulating each bit in software, a technique that is called bit banging. The main limitation is the speed and accuracy of the software loops and the number of CPU cycles you want to dedicate to them. Generally speaking, it is hard to achieve timer accuracy better than a millisecond unless you configure a real-time kernel, as we shall see in Chapter 16, Real-Time Programming. More common use cases for GPIO are for reading push buttons and digital sensors and controlling LEDs, motors, and relays.
Most SoCs have a lot of GPIO bits, which are grouped together in GPIO registers, usually 32 bits per register. On-chip GPIO bits are routed through to GPIO pins on the chip package via a multiplexer, known as a pin mux. There may be additional GPIO pins available off-chip in the power management chip, and in dedicated GPIO extenders, connected through I2C or SPI buses. All this diversity is handled by a kernel subsystem known as gpiolib, which is not actually a library but the infrastructure GPIO drivers use to expose I/O in a consistent way. There are details about the implementation of gpiolib in the kernel source in Documentation/gpio and the code for the drivers themselves is in drivers/gpio.
Applications can interact with gpiolib through files in the /sys/class/gpio directory. Here is an example of what you will see in there on a typical embedded board (a BeagleBone Black):
# ls /sys/class/gpio
export gpiochip0 gpiochip32 gpiochip64 gpiochip96 unexport
The directories named gpiochip0 through to gpiochip96 represent four GPIO registers, each with 32 GPIO bits. If you look in one of the gpiochip directories, you will see the following:
# ls /sys/class/gpio/gpiochip96
base label ngpio power subsystem uevent
The file named base contains the number of the first GPIO pin in the register and ngpio contains the number of bits in the register. In this case, gpiochip96/base is 96 and gpiochip96/ngpio is 32, which tells you that it contains GPIO bits 96 to 127. It is possible for there to be a gap between the last GPIO in one register and the first GPIO in the next.
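The arithmetic behind base and ngpio can be sketched in a few lines of C. This is only an illustration: struct gpio_chip_range and find_gpio_chip are names made up for this example, and the values would normally be read from the base and ngpio files of each gpiochip directory.

```c
/* One entry per gpiochip directory, holding the values of its
 * base and ngpio files. (Hypothetical type, for illustration.) */
struct gpio_chip_range {
    int base;   /* number of the first GPIO in the register */
    int ngpio;  /* number of GPIO bits in the register */
};

/* Return the index of the chip that contains the given GPIO number,
 * or -1 if the number falls in a gap or beyond the last chip. */
int find_gpio_chip(const struct gpio_chip_range *chips, int nchips, int gpio)
{
    int i;

    for (i = 0; i < nchips; i++) {
        if (gpio >= chips[i].base &&
            gpio < chips[i].base + chips[i].ngpio)
            return i;
    }
    return -1;
}
```

With the four chips shown above (bases 0, 32, 64, and 96, each with 32 bits), GPIO 53 falls in the second chip at offset 53 - 32 = 21, and GPIO 128 is not found at all.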
To control a GPIO bit from user space, you first have to export it from kernel space, which you do by writing the GPIO number to /sys/class/gpio/export. This example shows the process for GPIO 53, which is wired to user LED 0 on the BeagleBone Black:
# echo 53 > /sys/class/gpio/export
# ls /sys/class/gpio
export gpio53 gpiochip0 gpiochip32 gpiochip64 gpiochip96 unexport
Now, there is a new directory, gpio53, which contains the files you need to control the pin.
If the GPIO bit is already claimed by the kernel, you will not be able to export it in this way.
The directory gpio53 contains these files:
# ls /sys/class/gpio/gpio53
active_low direction power uevent
device edge subsystem value
The pin begins as an input. To change it to an output, write out to the direction file. The file value contains the current state of the pin, which is 0 for low and 1 for high. If it is an output, you can change the state by writing 0 or 1 to value. Sometimes, the meaning of low and high is reversed in hardware (hardware engineers enjoy doing that sort of thing), so writing 1 to active_low inverts the meaning of value such that a low voltage is reported as 1 and a high voltage as 0.
You can remove a GPIO from user space control by writing the GPIO number to /sys/class/gpio/unexport.
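The same sequence of writes can be made from a program rather than the shell. The following is a minimal sketch, not code from the book archive: sysfs_write and gpio_set_output are hypothetical helper names, and the gpio_root parameter (normally "/sys/class/gpio") is an assumption added so that the path is not hard-coded.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

/* Write a short text value to a sysfs attribute file.
 * Returns 0 on success, -1 on failure. */
int sysfs_write(const char *path, const char *value)
{
    size_t len = strlen(value);
    ssize_t n;
    int fd = open(path, O_WRONLY);

    if (fd == -1)
        return -1;
    n = write(fd, value, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}

/* Export a GPIO, make it an output, and set its level.
 * gpio_root is normally "/sys/class/gpio". */
int gpio_set_output(const char *gpio_root, int gpio, int level)
{
    char path[128];
    char num[16];

    snprintf(num, sizeof(num), "%d", gpio);
    snprintf(path, sizeof(path), "%s/export", gpio_root);
    /* The export write fails if the GPIO is already exported, so the
     * result is ignored here; a pin that is really unavailable will
     * make the direction write below fail instead. */
    (void)sysfs_write(path, num);

    snprintf(path, sizeof(path), "%s/gpio%d/direction", gpio_root, gpio);
    if (sysfs_write(path, "out") == -1)
        return -1;

    snprintf(path, sizeof(path), "%s/gpio%d/value", gpio_root, gpio);
    return sysfs_write(path, level ? "1" : "0");
}
```

On the BeagleBone Black, gpio_set_output("/sys/class/gpio", 53, 1) would be expected to turn on user LED 0, provided the pin has not been claimed by a kernel driver.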
Handling interrupts from GPIO
In many cases, a GPIO input can be configured to generate an interrupt when it changes state, which allows you to wait for the interrupt rather than polling in an inefficient software loop. If the GPIO bit can generate interrupts, the file called edge exists. Initially, it has the value called none, meaning that it does not generate interrupts. To enable interrupts, you can set it to one of these values:
rising: Interrupt on rising edge
falling: Interrupt on falling edge
both: Interrupt on both rising and falling edges
none: No interrupts (default)
You can wait for an interrupt using the poll() function with POLLPRI as the event. If you want to wait for a falling edge on GPIO 48, you first enable the interrupts:
# echo 48 > /sys/class/gpio/export
# echo falling > /sys/class/gpio/gpio48/edge
Then, you use poll(2) to wait for the change, as shown in this code example, which you can see in the book code archive in MELP/chapter_09/gpio-int/gpio-int.c:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <poll.h>

int main(int argc, char *argv[])
{
    int f;
    struct pollfd poll_fds[1];
    int ret;
    char value[4];
    int n;

    f = open("/sys/class/gpio/gpio48/value", O_RDONLY);
    if (f == -1) {
        perror("Can't open gpio48");
        return 1;
    }

    /* Read the current value before waiting for changes */
    n = read(f, value, sizeof(value));
    if (n > 0) {
        printf("Initial value=%c\n", value[0]);
        lseek(f, 0, SEEK_SET);
    }

    poll_fds[0].fd = f;
    poll_fds[0].events = POLLPRI | POLLERR;
    while (1) {
        printf("Waiting\n");
        /* Block with no timeout until the pin changes state */
        ret = poll(poll_fds, 1, -1);
        if (ret > 0) {
            n = read(f, value, sizeof(value));
            if (n > 0)
                printf("Button pressed: value=%c\n", value[0]);
            /* Rewind so that the next read returns the new value */
            lseek(f, 0, SEEK_SET);
        }
    }
    return 0;
}
LEDs
LEDs are often controlled through a GPIO pin, but there is another kernel subsystem that offers more specialized control specific to the purpose. The leds kernel subsystem adds the ability to set brightness, should the LED have that ability, and it can handle LEDs connected in other ways than a simple GPIO pin. It can be configured to trigger the LED on an event such as block device access or just a heartbeat to show that the device is working. You will have to configure your kernel with the option CONFIG_LEDS_CLASS, and with the LED trigger actions that are appropriate to you. There is more information in Documentation/leds/, and the drivers are in drivers/leds/.
As with GPIOs, LEDs are controlled through an interface in sysfs in the directory /sys/class/leds. In the case of the BeagleBone Black, the names of the LEDs are encoded in the device tree in the form devicename:colour:function, as shown here:
# ls /sys/class/leds
beaglebone:green:heartbeat beaglebone:green:usr2
beaglebone:green:mmc0 beaglebone:green:usr3
Now, we can look at the attributes of one of the LEDs. The colon characters, ':', in the path name are written here with a preceding backslash escape character, '\', which is how shell tab completion typically enters them; most shells also accept the path with the colons unescaped:
# cd /sys/class/leds/beaglebone\:green\:usr2
# ls
brightness max_brightness subsystem uevent
device power trigger
The brightness file controls the brightness of the LED and can be a number between 0 (off) and max_brightness (fully on). If the LED doesn't support intermediate brightness, any non-zero value turns it on. The file called trigger lists the events that trigger the LED to turn on. The list of triggers is implementation dependent. Here is an example:
# cat trigger
none mmc0 mmc1 timer oneshot heartbeat backlight gpio [cpu0] default-on
The trigger currently selected is shown in square brackets. You can change it by writing one of the other triggers to the file. If you want to control the LED entirely through brightness, select none. If you set the trigger to timer, two extra files appear that allow you to set the on and off times in milliseconds:
# echo timer > trigger
# ls
brightness delay_on max_brightness subsystem uevent
delay_off device power trigger
# cat delay_on
500
# cat delay_off
500
If the LED has on-chip timer hardware, the blinking takes place without interrupting the CPU.
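A program that manages LEDs usually needs to know which trigger is active. Here is a small sketch in C that extracts the bracketed name from the text of the trigger file. The function name selected_trigger is hypothetical, and the code assumes the bracket convention shown in the example output above.

```c
#include <string.h>

/* Copy the trigger name found inside [brackets] into out.
 * text is the content of the trigger file; outlen is the size of out.
 * Returns 0 on success, -1 if no bracketed name is found or if out
 * is too small. (Hypothetical helper, for illustration.) */
int selected_trigger(const char *text, char *out, size_t outlen)
{
    const char *start = strchr(text, '[');
    const char *end;
    size_t len;

    if (start == NULL)
        return -1;
    start++;                      /* skip the opening bracket */
    end = strchr(start, ']');
    if (end == NULL)
        return -1;
    len = (size_t)(end - start);
    if (len + 1 > outlen)         /* leave room for the terminator */
        return -1;
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

Given the trigger list shown earlier, the function would place cpu0 in out.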
I2C
I2C is a simple low speed 2-wire bus that is common on embedded boards, typically used to access peripherals that are not on the SoC, such as display controllers, camera sensors, GPIO extenders, and so on. There is a related standard known as system management bus (SMBus) that is found on PCs, which is used to access temperature and voltage sensors. SMBus is a subset of I2C.
I2C is a master-slave protocol with the master being one or more host controllers on the SoC. Slaves have a 7-bit address assigned by the manufacturer (read the data sheet), allowing up to 128 nodes per bus, but 16 are reserved, so only 112 nodes are allowed in practice. The master may initiate a read or write transaction with one of the slaves. Frequently, the first byte is used to specify a register on the slave, and the remaining bytes are the data read from or written to that register.
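The figure of 112 usable addresses can be checked with a short sketch. It assumes the reserved ranges defined by the I2C specification, 0x00 to 0x07 and 0x78 to 0x7f, which are not spelled out in the text above; the function names are made up for this example.

```c
#include <stdbool.h>

/* Return true if a 7-bit I2C address lies in one of the two reserved
 * ranges, 0x00-0x07 or 0x78-0x7f (8 addresses each, 16 in total). */
bool i2c_addr_is_reserved(unsigned int addr)
{
    return addr <= 0x07 || (addr >= 0x78 && addr <= 0x7f);
}

/* Count the 7-bit addresses that are available to ordinary slaves. */
int i2c_count_usable(void)
{
    unsigned int addr;
    int count = 0;

    for (addr = 0; addr <= 0x7f; addr++) {
        if (!i2c_addr_is_reserved(addr))
            count++;
    }
    return count;
}
```

Of the 128 possible addresses, the 16 reserved ones are excluded, which leaves the 112 mentioned above.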
There is one device node for each host controller. For example, this SoC has four:
# ls -l /dev/i2c*
crw-rw---- 1 root i2c 89, 0 Jan 1 00:18 /dev/i2c-0
crw-rw---- 1 root i2c 89, 1 Jan 1 00:18 /dev/i2c-1
crw-rw---- 1 root i2c 89, 2 Jan 1 00:18 /dev/i2c-2
crw-rw---- 1 root i2c 89, 3 Jan 1 00:18 /dev/i2c-3
The device interface provides a series of ioctl commands that query the host controller and send the read and write commands to I2C slaves. There is a package named i2c-tools, which uses this interface to provide basic command-line tools to interact with I2C devices. The tools are as follows:
i2cdetect: This lists the I2C adapters, and probes the bus
i2cdump: This dumps data from all the registers of an I2C peripheral
i2cget: This reads data from an I2C slave
i2cset: This writes data to an I2C slave
The i2c-tools package is available in Buildroot and the Yocto Project as well as most mainstream distributions. So long as you know the address and protocol of the slave, writing a user space program to talk to the device is straightforward. The example that follows shows how to read the first four bytes from the AT24C512B EEPROM that is mounted on the BeagleBone Black on I2C bus 0, slave address 0x50 (the code is in MELP/chapter_09/i2c-example):
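#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/i2c-dev.h>

#define I2C_ADDRESS 0x50

int main(void)
{
    int f;
    int n;
    unsigned char buf[10];

    f = open("/dev/i2c-0", O_RDWR);
    if (f == -1) {
        perror("Can't open /dev/i2c-0");
        return 1;
    }

    /* Set the address of the i2c slave device */
    if (ioctl(f, I2C_SLAVE, I2C_ADDRESS) == -1) {
        perror("Can't set slave address");
        close(f);
        return 1;
    }

    /* Set the 16-bit address to read from to 0 */
    buf[0] = 0; /* address byte 1 */
    buf[1] = 0; /* address byte 2 */
    n = write(f, buf, 2);
    if (n != 2) {
        perror("Can't write EEPROM address");
        close(f);
        return 1;
    }

    /* Now read 4 bytes from that address */
    n = read(f, buf, 4);
    if (n != 4) {
        perror("Can't read from EEPROM");
        close(f);
        return 1;
    }
    printf("0x%x 0x%x 0x%x 0x%x\n",
           buf[0], buf[1], buf[2], buf[3]);

    close(f);
    return 0;
}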
There is more information about the Linux implementation of I2C in Documentation/i2c/dev-interface. The host controller drivers are in drivers/i2c/busses.
Serial Peripheral Interface (SPI)
The SPI bus is similar to I2C, but is a lot faster, up to tens of MHz. The interface uses four wires with separate send and receive lines, which allow it to operate in full duplex. Each chip on the bus is selected with a dedicated chip select line. It is commonly used to connect to touchscreen sensors, display controllers, and serial NOR flash devices.
As with I2C, it is a master-slave protocol with most SoCs implementing one or more master host controllers. There is a generic SPI device driver, which you can enable through the kernel configuration CONFIG_SPI_SPIDEV. It creates a device node for each SPI device assigned to it, that is, for each combination of bus and chip select, which allows you to access SPI chips from user space. The device nodes are named spidev[bus].[chip select]:
# ls -l /dev/spi*
crw-rw---- 1 root root 153, 0 Jan 1 00:29 /dev/spidev1.0
For examples of using the spidev interface, refer to the example code in Documentation/spi.
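To give a flavor of the spidev interface, here is a minimal sketch of a single full-duplex transfer using the SPI_IOC_MESSAGE ioctl from linux/spi/spidev.h. The function name spi_transfer and its parameters are invented for this example; the right speed and word size depend on the chip you are talking to, so treat the values as assumptions.

```c
#include <stdint.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/spi/spidev.h>

/* Send len bytes from tx and receive len bytes into rx in a single
 * full-duplex transfer on the given spidev node.
 * Returns 0 on success, -1 on failure. (Hypothetical helper.) */
int spi_transfer(const char *dev, const uint8_t *tx, uint8_t *rx,
                 uint32_t len, uint32_t speed_hz)
{
    struct spi_ioc_transfer tr;
    int fd;
    int ret;

    fd = open(dev, O_RDWR);
    if (fd == -1)
        return -1;

    memset(&tr, 0, sizeof(tr));
    tr.tx_buf = (unsigned long)tx;   /* buffer pointers are passed as integers */
    tr.rx_buf = (unsigned long)rx;
    tr.len = len;
    tr.speed_hz = speed_hz;
    tr.bits_per_word = 8;            /* assumption: 8-bit words */

    /* One message consisting of one transfer */
    ret = ioctl(fd, SPI_IOC_MESSAGE(1), &tr);
    close(fd);
    return (ret < 0) ? -1 : 0;
}
```

A call such as spi_transfer("/dev/spidev1.0", tx, rx, 2, 1000000) would clock out two bytes at 1 MHz while reading two bytes back, assuming a chip is attached to bus 1, chip select 0.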
Writing a kernel device driver
Eventually, when you have exhausted all the previous user space options, you will find yourself having to write a device driver to access a piece of hardware attached to your device. Character drivers are the most flexible and should cover 90% of all your needs; network drivers apply if you are working with a network interface and block drivers are for mass storage. The task of writing a kernel driver is complex and beyond the scope of this book. There are some references at the end that will help you on your way. In this section, I want to outline the options available for interacting with a driver (a topic not normally covered) and show you the bare bones of a character device driver.
Designing a character driver interface
The main character driver interface is based on a stream of bytes, as you would have with a serial port. However, many devices don't fit this description: a controller for a robot arm needs functions to move and rotate each joint, for example. Luckily, there are other ways to communicate with device drivers than just read and write:
ioctl: The ioctl function allows you to pass two arguments to your driver which can have any meaning you like. By convention, the first argument is a command, which selects one of several functions in your driver, and the second is a pointer to a structure, which serves as a container for the input and output parameters. This is a blank canvas that allows you to design any program interface you like. It is pretty common when the driver and application are closely linked and written by the same team. However, ioctl is deprecated in the kernel, and you will find it hard to get any drivers with new uses of ioctl accepted upstream. The kernel maintainers dislike ioctl because it makes kernel code and application code too interdependent, and it is hard to keep both of them in step across kernel versions and architectures.
sysfs: This is the preferred way now, a good example being the GPIO interface described earlier. The advantages are that it is somewhat self-documenting, so long as you choose descriptive names for the files. It is also scriptable because the file contents are usually text strings. On the other hand, the requirement for each file to contain a single value makes it hard to achieve atomicity if you need to change more than one value at a time. Conversely, ioctl passes all its arguments in a structure in a single function call.
mmap: You can get direct access to kernel buffers and hardware registers by mapping kernel memory into user space, bypassing the kernel. You may still need some kernel code to handle interrupts and DMA. There is a subsystem that encapsulates this idea, known as uio, which is short for user I/O. There is more documentation in Documentation/DocBook/uio-howto, and there are example drivers in drivers/uio.
sigio: You can send a signal from a driver using the kernel function named kill_fasync() to notify applications of an event such as input becoming ready or an interrupt being received. By convention, the signal called SIGIO is used, but it could be any. You can see some examples in the UIO driver, drivers/uio/uio.c, and in the RTC driver, drivers/char/rtc.c. The main problem is that it is difficult to write reliable signal handlers in user space, and so it remains a little-used facility.
debugfs: This is another pseudo filesystem that represents kernel data as files and directories, similar to proc and sysfs. The main distinction is that debugfs must not contain information that is needed for the normal operation of the system; it is for debug and trace information only. It is mounted as mount -t debugfs debug /sys/kernel/debug. There is a good description of debugfs in the kernel documentation, Documentation/filesystems/debugfs.txt.
proc: The proc filesystem is deprecated for all new code unless it relates to processes, which was the original intended purpose for the filesystem. However, you can use proc to publish any information you choose. And, unlike sysfs and debugfs, it is available to non-GPL modules.
netlink: This is a socket protocol family. AF_NETLINK creates a socket that links kernel space to user space. It was originally created so that network tools could communicate with the Linux network code to access the routing tables and other details. It is also used by udev to pass events from the kernel to the udev daemon. It is very rarely used in general device drivers.
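To make the ioctl convention more concrete, here is a sketch of the user space side of an interface for the robot arm mentioned earlier. Everything specific to the arm is invented for illustration: struct arm_move, the magic character 'r', the command numbers, and arm_move_joint are all hypothetical. Only the _IOW and _IOWR macros and ioctl(2) itself are real.

```c
#include <sys/ioctl.h>
#include <linux/ioctl.h>

/* Hypothetical container for input and output parameters */
struct arm_move {
    int joint;        /* which joint to move */
    int angle_mdeg;   /* target angle in thousandths of a degree */
};

/* Hypothetical command numbers. The macros encode the direction of
 * the data transfer, a magic character, a sequence number, and the
 * size of the structure into a single command value. */
#define ARM_IOC_MAGIC   'r'
#define ARM_IOC_MOVE    _IOW(ARM_IOC_MAGIC, 1, struct arm_move)
#define ARM_IOC_GET_POS _IOWR(ARM_IOC_MAGIC, 2, struct arm_move)

/* Ask the driver behind fd to move one joint.
 * Returns 0 on success, -1 on failure with errno set. */
int arm_move_joint(int fd, int joint, int angle_mdeg)
{
    struct arm_move req = { .joint = joint, .angle_mdeg = angle_mdeg };

    return ioctl(fd, ARM_IOC_MOVE, &req);
}
```

The first argument selects the function and the second carries all of the parameters in one call, which is what makes this style atomic. The cost is that the structure layout becomes part of the interface between the driver and the application, which is the interdependence that the kernel maintainers object to.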
There are many examples of all of the preceding in the kernel source code, and you can design really interesting interfaces to your driver code. The only universal rule is the principle of least astonishment. In other words, application writers using your driver should find that everything works in a logical way without any quirks or oddities.
The anatomy of a device driver
It's time to draw some threads together by looking at the code for a simple device driver. Here is a device driver named dummy, which creates four devices that are accessed through /dev/dummy0 to /dev/dummy3. The complete source code for the driver follows; you will find the code in MELP/chapter_09/dummy-driver:
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <linux/err.h>

#define DEVICE_NAME "dummy"
#define MAJOR_NUM 42
#define NUM_DEVICES 4

static struct class *dummy_class;

static int dummy_open(struct inode *inode, struct file *file)
{
    pr_info("%s\n", __func__);
    return 0;
}

static int dummy_release(struct inode *inode, struct file *file)
{
    pr_info("%s\n", __func__);
    return 0;
}

/* Always reports end of file; no data is copied to user space */
static ssize_t dummy_read(struct file *file,
        char __user *buffer, size_t length, loff_t *offset)
{
    pr_info("%s %zu\n", __func__, length);
    return 0;
}

/* Claims to have consumed everything; the data is discarded */
static ssize_t dummy_write(struct file *file,
        const char __user *buffer, size_t length, loff_t *offset)
{
    pr_info("%s %zu\n", __func__, length);
    return length;
}

struct file_operations dummy_fops = {
    .owner = THIS_MODULE,
    .open = dummy_open,
    .release = dummy_release,
    .read = dummy_read,
    .write = dummy_write,
};

int __init dummy_init(void)
{
    int ret;
    int i;

    printk("Dummy loaded\n");
    ret = register_chrdev(MAJOR_NUM, DEVICE_NAME, &dummy_fops);
    if (ret != 0)
        return ret;

    dummy_class = class_create(THIS_MODULE, DEVICE_NAME);
    if (IS_ERR(dummy_class)) {
        /* Give back the major number if the class cannot be created */
        unregister_chrdev(MAJOR_NUM, DEVICE_NAME);
        return PTR_ERR(dummy_class);
    }
    for (i = 0; i < NUM_DEVICES; i++) {
        device_create(dummy_class, NULL,
                      MKDEV(MAJOR_NUM, i), NULL, "dummy%d", i);
    }
    return 0;
}

void __exit dummy_exit(void)
{
    int i;

    for (i = 0; i < NUM_DEVICES; i++) {
        device_destroy(dummy_class, MKDEV(MAJOR_NUM, i));
    }
    class_destroy(dummy_class);
    unregister_chrdev(MAJOR_NUM, DEVICE_NAME);
    printk("Dummy unloaded\n");
}

module_init(dummy_init);
module_exit(dummy_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Chris Simmonds");
MODULE_DESCRIPTION("A dummy driver");
At the end of the code, the macros called module_init and module_exit specify the functions to be called when the module is loaded and unloaded. The three macros named MODULE_* add some basic information about the module, which can be retrieved from the compiled kernel module using the modinfo command.
When the module is loaded, the dummy_init() function is called. You can see the point at which it becomes a character device when it makes the call to register_chrdev, passing a pointer to struct file_operations, which contains pointers to the four functions that the driver implements. While register_chrdev tells the kernel that there is a driver with a major number of 42, it doesn't say anything about the class of driver, and so it will not create an entry in /sys/class. Without an entry in /sys/class, the device manager cannot create device nodes. So, the next few lines of code create a device class, dummy, and four devices of that class called dummy0 to dummy3. The result is that the /sys/class/dummy directory is created when the driver is initialized, containing subdirectories dummy0 to dummy3. Each of the subdirectories contains a file, dev, with the major and minor numbers of the device. This is all that a device manager needs to create device nodes: /dev/dummy0 to /dev/dummy3.
The dummy_exit function has to release the resources claimed by dummy_init, which here means freeing up the device class and major number.
The file operations for this driver are implemented by dummy_open(), dummy_read(), dummy_write(), and dummy_release() and are called when a user space program calls open(2), read(2), write(2), and close(2). They just print a kernel message so that you can see that they were called. You can demonstrate this from the command line using the echo command:
# echo hello > /dev/dummy0
dummy_open
dummy_write 6
dummy_release
In this case, the messages appear because I was logged on to the console, and kernel messages are printed to the console by default. If you are not logged on to the console, you can still see the kernel messages using the command dmesg.
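You can also exercise the driver from a small C program instead of echo. The following is a minimal sketch; exercise_device is a name made up for this example, and it simply makes the four system calls that map on to the driver's file operations.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open the device node, write six bytes, attempt a read, and close.
 * The byte counts returned by write(2) and read(2) are passed back
 * through written and nread. Returns 0 if the node could be opened,
 * -1 otherwise. (Hypothetical helper, for illustration.) */
int exercise_device(const char *path, ssize_t *written, ssize_t *nread)
{
    char buf[16];
    int fd = open(path, O_RDWR);    /* reaches the driver's open */

    if (fd == -1)
        return -1;
    *written = write(fd, "hello\n", 6);     /* the driver's write */
    *nread = read(fd, buf, sizeof(buf));    /* the driver's read */
    close(fd);                              /* the driver's release */
    return 0;
}
```

Run against /dev/dummy0, this would be expected to report 6 bytes written and 0 bytes read, since dummy_write() returns the length it was given and dummy_read() always returns 0. The kernel log should show the open, write, read, and release messages in that order.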
The full source code for this driver is less than 100 lines, but it is enough to illustrate how the linkage between a device node and driver code works, how the device class is created, allowing a device manager to create device nodes automatically when the driver is loaded, and how read and write calls from user space reach the driver code. Next, you need to build it.
Compiling kernel modules
At this point, you have some driver code that you want to compile and test on your target system. You can copy it into the kernel source tree and modify makefiles to build it, or you can compile it as a module out of tree. Let's start by building out of tree.
You need a simple makefile which uses the kernel build system to do the hard work:
LINUXDIR := $(HOME)/MELP/build/linux
obj-m := dummy.o
all:
	make ARCH=arm CROSS_COMPILE=arm-cortex_a8-linux-gnueabihf- \
	-C $(LINUXDIR) M=$(shell pwd)
clean:
	make -C $(LINUXDIR) M=$(shell pwd) clean
Set LINUXDIR to the directory of the kernel for your target device that you will be running the module on. The obj-m := dummy.o code will invoke the kernel build rule to take the source file, dummy.c, and create the kernel module, dummy.ko. I will show you how to load kernel modules in the next section.
Kernel modules are not binary compatible between kernel releases and configurations: the module will only load on the kernel it was compiled with.
If you want to build a driver in the kernel source tree, the procedure is quite simple. Choose a directory appropriate to the type of driver you have. The driver is a basic character device, so I would put dummy.c in drivers/char. Then, edit the makefile in the directory, and add a line to build the driver unconditionally as a module, as follows:
obj-m += dummy.o
Or add the following line to build it unconditionally as a built-in:
obj-y += dummy.o
If you want to make the driver optional, you can add a menu option to the Kconfig file and make the compilation conditional on the configuration option, as I described in Chapter 4, Configuring and Building the Kernel, in the section, Understanding kernel configuration.
Loading kernel modules
You can load, unload, and list modules using the simple insmod, lsmod, and rmmod commands. Here they are shown loading the dummy driver:
# insmod /lib/modules/4.8.12-yocto-standard/kernel/drivers/dummy.ko
# lsmod
Tainted: G
dummy 2062 0 - Live 0xbf004000 (O)
# rmmod dummy
If the module is placed in a subdirectory in /lib/modules/<kernel release>, you can create a modules dependency database using the command depmod -a:
# depmod -a
# ls /lib/modules/4.8.12-yocto-standard
kernel modules.alias modules.dep modules.symbols
The information in the modules.* files is used by the modprobe command to locate a module by name rather than the full path. modprobe has many other features, which are described on the manual page modprobe(8).
Discovering the hardware configuration
The dummy driver demonstrates the structure of a device driver, but it lacks interaction with real hardware since it only manipulates memory structures. Device drivers are usually written to interact with hardware. Part of that is being able to discover the hardware in the first place, bearing in mind that it may be at different addresses in different configurations.
In some cases, the hardware provides the information itself. Devices on a discoverable bus such as PCI or USB have a query mode, which returns resource requirements and a unique identifier. The kernel matches the identifier and possibly other characteristics with the device drivers, and marries them up.
However, most of the hardware blocks on an embedded board do not have such identifiers. You have to provide the information yourself in the form of a device tree or as C structures known as platform data.
In the standard driver model for Linux, device drivers register themselves with the appropriate subsystem: PCI, USB, open firmware (device tree), platform device, and so on. The registration includes an identifier and a callback function called a probe function that is called if there is a match between the ID of the hardware and the ID of the driver. For PCI and USB, the ID is based on the vendor and the product IDs of the devices; for device tree and platform devices, it is a name (a text string).
Device trees
I gave you an introduction to device trees in Chapter 3, All About Bootloaders. Here, I want to show you how the Linux device drivers hook up with this information.
As an example, I will use the ARM Versatile board, arch/arm/boot/dts/versatile-ab.dts, for which the Ethernet adapter is defined here:
net@10010000 {
    compatible = "smsc,lan91c111";
    reg = <0x10010000 0x10000>;
    interrupts = <25>;
};
The platform data
In the absence of device tree support, there is a fallback method of describing hardware using C structures, known as the platform data.
Each piece of hardware is described by struct platform_device, which has a name and a pointer to an array of resources. The type of the resource is determined by flags, which include the following:
IORESOURCE_MEM: This is the physical address of a region of memory
IORESOURCE_IO: This is the physical address or port number of IO registers
IORESOURCE_IRQ: This is the interrupt number
Here is an example of the platform data for an Ethernet controller taken from arch/arm/mach-versatile/core.c, which has been edited for clarity:
#define VERSATILE_ETH_BASE 0x10010000
#define IRQ_ETH 25

static struct resource smc91x_resources[] = {
    [0] = {
        .start = VERSATILE_ETH_BASE,
        .end = VERSATILE_ETH_BASE + SZ_64K - 1,
        .flags = IORESOURCE_MEM,
    },
    [1] = {
        .start = IRQ_ETH,
        .end = IRQ_ETH,
        .flags = IORESOURCE_IRQ,
    },
};

static struct platform_device smc91x_device = {
    .name = "smc91x",
    .id = 0,
    .num_resources = ARRAY_SIZE(smc91x_resources),
    .resource = smc91x_resources,
};
It has a memory area of 64 KB and an interrupt. The platform data has to be registered with the kernel, usually when the board is initialized:
void __init versatile_init(void)
{
    platform_device_register(&versatile_flash_device);
    platform_device_register(&versatile_i2c_device);
    platform_device_register(&smc91x_device);
    [...]
Linking hardware with device drivers
You have seen in the preceding section how an Ethernet adapter is described using a device tree and using platform data. The corresponding driver code is in drivers/net/ethernet/smsc/smc91x.c, and it works with both the device tree and platform data. Here is the initialization code, once again edited for clarity:
static const struct of_device_id smc91x_match[] = {
    { .compatible = "smsc,lan91c94", },
    { .compatible = "smsc,lan91c111", },
    {},
};
MODULE_DEVICE_TABLE(of, smc91x_match);

static struct platform_driver smc_driver = {
    .probe = smc_drv_probe,
    .remove = smc_drv_remove,
    .driver = {
        .name = "smc91x",
        .of_match_table = of_match_ptr(smc91x_match),
    },
};

static int __init smc_driver_init(void)
{
    return platform_driver_register(&smc_driver);
}

static void __exit smc_driver_exit(void)
{
    platform_driver_unregister(&smc_driver);
}

module_init(smc_driver_init);
module_exit(smc_driver_exit);
When the driver is initialized, it calls platform_driver_register(), pointing to struct platform_driver, in which there is a callback to a probe function, a driver name, smc91x, and a pointer to struct of_device_id.
If this driver has been configured by the device tree, the kernel will look for a match between the compatible property in the device tree node and the string pointed to by the compatible structure element. For each match, it calls the probe function.
On the other hand, if it was configured through platform data, the probe function will be called for each match on the string pointed to by driver.name.
The probe function extracts information about the interface:
static int smc_drv_probe(struct platform_device *pdev)
{
    struct smc91x_platdata *pd = dev_get_platdata(&pdev->dev);
    const struct of_device_id *match = NULL;
    struct resource *res, *ires;
    int irq;

    res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
    ires = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
    [...]
    addr = ioremap(res->start, SMC_IO_EXTENT);
    irq = ires->start;
    [...]
}
The calls to platform_get_resource() extract the memory and irq information from either the device tree or the platform data. It is up to the driver to map the memory and install the interrupt handler. The third parameter, which is zero in both of the previous cases, comes into play if there is more than one resource of that particular type.
Device trees allow you to configure more than just basic memory ranges and interrupts. There is a section of code in the probe function that extracts optional parameters from the device tree. In this snippet, it gets the reg-io-width property:
match = of_match_device(of_match_ptr(smc91x_match), &pdev->dev);
if (match) {
    struct device_node *np = pdev->dev.of_node;
    u32 val;
    [...]
    of_property_read_u32(np, "reg-io-width", &val);
    [...]
}
For most drivers, specific bindings are documented in Documentation/devicetree/bindings. For this particular driver, the information is in Documentation/devicetree/bindings/net/smsc-lan91c111.txt.
The main thing to remember here is that drivers should register a probe function and enough information for the kernel to call the probe, as it finds matches with the hardware it knows about. The linkage between the hardware described by the device tree and the device driver is through the compatible property. The linkage between platform data and a driver is through the name.
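The two kinds of linkage can be modeled in ordinary user space C. The sketch below is a deliberately simplified model, not the kernel's matching code: struct model_driver, struct model_device, and model_match are stand-ins invented for this example, and the real platform bus considers further criteria that are left out here.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a platform driver: a name plus an optional
 * NULL-terminated list of compatible strings. (Hypothetical type.) */
struct model_driver {
    const char *name;
    const char *const *compatible;
};

/* Simplified stand-in for a device described either by platform data
 * (name) or by a device tree node (compatible). (Hypothetical type.) */
struct model_device {
    const char *name;
    const char *compatible;
};

/* Return 1 if the driver should be probed for this device, else 0.
 * The compatible property is tried first, then the name. */
int model_match(const struct model_driver *drv,
                const struct model_device *dev)
{
    size_t i;

    if (dev->compatible != NULL && drv->compatible != NULL) {
        for (i = 0; drv->compatible[i] != NULL; i++) {
            if (strcmp(drv->compatible[i], dev->compatible) == 0)
                return 1;
        }
    }
    if (dev->name != NULL && strcmp(drv->name, dev->name) == 0)
        return 1;
    return 0;
}
```

In this model, a device tree node with compatible set to smsc,lan91c111 and a platform device named smc91x would both match a driver that is set up like smc_driver, and in either case the probe function would then be called.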
Additional reading
The following resources have further information about the topics introduced in this chapter:
Linux Kernel Development, 3rd edition, by Robert Love
Linux Weekly News, https://lwn.net/, especially the kernel news section
Essential Linux Device Drivers, 1st edition (27 Mar. 2008), by Sreekrishnan Venkateswaran
Summary
Device drivers have the job of handling devices, usually physical hardware but sometimes virtual interfaces, and presenting them to user space in a consistent and useful way. Linux device drivers fall into three broad categories: character, block, and network. Of the three, the character driver interface is the most flexible and therefore the most common. Linux drivers fit into a framework known as the driver model, which is exposed through sysfs. Pretty much the entire state of the devices and drivers is visible in /sys.
Each embedded system has its own unique set of hardware interfaces and requirements. Linux provides drivers for most standard interfaces, and by selecting the right kernel configuration, you can get a working target board very quickly. This leaves you with the non-standard components for which you will have to add your own device support.
In some cases, you can sidestep the issue by using generic drivers for GPIO, I2C, and so on, and write user space code to do the work. I recommend this as a starting point, as it gives you the chance to get familiar with the hardware without writing kernel code. Writing kernel drivers is not particularly difficult, but if you do, you need to code carefully so as not to compromise the stability of the system.
I have talked about writing the kernel driver code: if you go down this route, you will inevitably want to know how to check whether or not it is working correctly and detect any bugs. I will cover that topic in Chapter 14, Debugging with GDB.
The next chapter is all about user space initialization and the different options you have for the init program, from the simple BusyBox init to more complex systems.
Starting Up – The init Program
We looked at how the kernel boots up to the point that it launches the first program, init, in Chapter 4, Configuring and Building the Kernel. In Chapter 5, Building a Root Filesystem, and Chapter 6, Selecting a Build System, we looked at creating root filesystems of varying complexity, all of which contained an init program. Now, it is time to look at the init program in more detail and discover why it is so important to the rest of the system.
There are many possible implementations of init. I will describe the three main ones in this chapter: BusyBox init, System V init, and systemd. For each one, I will give an overview of how it works and the types of system it suits best. Part of this is balancing the tradeoff between complexity and flexibility.
In this chapter we will cover the following topics:
After the kernel has booted
Introducing the init programs
System V init
systemd
After the kernel has booted
We saw in Chapter 4, Configuring and Building the Kernel, how the kernel bootstrap code seeks to find a root filesystem, either initramfs or a filesystem specified by root= on the kernel command line, and then to execute a program which, by default, is /init for initramfs and /sbin/init for a regular filesystem. The init program has root privilege, and since it is the first process to run, it has a process ID (PID) of 1. If, for some reason, init cannot be started, the kernel will panic.
The init program is the ancestor of all other processes, as shown here by the pstree command running on a simple embedded Linux system:
# pstree -gn
init(1)-+-syslogd(63)
        |-klogd(66)
        |-dropbear(99)
        `-sh(100)---pstree(109)
The job of the init program is to take control of the system and set it running. It may be as simple as a shell command running a shell script (there is an example at the start of Chapter 5, Building a Root Filesystem) but, in the majority of cases, you will be using a dedicated init daemon. The tasks it has to perform are as follows:
At boot, it starts daemon programs and configures system parameters and the other things needed to get the system into a working state.
Optionally, it launches a login daemon, such as getty, on terminals that allow a login shell.
It adopts processes that become orphaned as a result of their immediate parent terminating and there being no other processes in the thread group.
It responds to any of init's immediate children terminating by catching the signal SIGCHLD and collecting the return value to prevent them becoming zombie processes. I will talk more about zombies in Chapter 12, Learning About Processes and Threads.
Optionally, it restarts those daemons that have terminated.
It handles the system shutdown.
In other words, init manages the life cycle of the system from boot up to shutdown. The current thinking is that init is well placed to handle other runtime events, such as new hardware and the loading and unloading of modules. This is what systemd does.
Introducing the init programs
The three init programs that you are most likely to encounter in embedded devices are BusyBox init, System V init, and systemd. Buildroot has options to build all three, with BusyBox init as the default. The Yocto Project allows you to choose between System V init and systemd, with System V init as the default.
The following table gives some metrics to compare the three:
Metric                 BusyBox init  System V init  systemd
Complexity             Low           Medium         High
Boot-up speed          Fast          Slow           Medium
Required shell         ash           ash or bash    None
Number of executables  0             4              50(*)
libc                   Any           Any            glibc
Size (MiB)             0             0.1            34(*)
(*) Based on the Buildroot configuration of systemd.
Broadly speaking, there is an increase in flexibility and complexity as you go from BusyBox init to systemd.
BusyBox init
BusyBox has a minimal init program that uses a configuration file, /etc/inittab, to define rules to start programs at boot up and to stop them at shutdown. Usually, the actual work is done by shell scripts, which, by convention, are placed in the /etc/init.d directory.
init begins by reading /etc/inittab. This contains a list of programs to run, one per line, with this format:
<id>::<action>:<program>
The role of these parameters is as follows:
id: This is the controlling terminal for the command
action: This is the condition under which to run this command, as described in the following list
program: This is the program to run
The actions are as follows:
sysinit: Runs the program when init starts, before any of the other types of actions.
respawn: Runs the program and restarts it if it terminates. It is typically used to run a program as a daemon.
askfirst: This is the same as respawn, but it prints the message Please press Enter to activate this console to the console, and it runs the program after Enter has been pressed. It is used to start an interactive shell on a terminal without prompting for a username or password.
once: Runs the program once but does not attempt to restart it if it terminates.
wait: Runs the program and waits for it to complete.
restart: Runs the program when init receives the signal SIGHUP, indicating that it should reload the inittab file.
ctrlaltdel: Runs the program when init receives the signal SIGINT, usually as a result of pressing Ctrl+Alt+Del on the console.
shutdown: Runs the program when init shuts down.
Here is a small example that mounts proc and sysfs and runs a shell on a serial interface:
null::sysinit:/bin/mount -t proc proc /proc
null::sysinit:/bin/mount -t sysfs sysfs /sys
console::askfirst:-/bin/sh
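To make the field layout concrete, here is a minimal shell sketch that splits one inittab entry into its fields. The function name parse_inittab_line is hypothetical and is not part of BusyBox. It relies on BusyBox using the same four colon-separated fields as System V init while ignoring the second (runlevels) field, which is why that field is empty in the entries above.

```shell
#!/bin/sh
# parse_inittab_line is a hypothetical helper, not part of BusyBox.
# It splits one inittab entry into id, runlevels, action and process.
parse_inittab_line() {
    # The last variable receives the rest of the line, so a process
    # field that itself contains colons is kept intact.
    IFS=: read -r id runlevels action process <<EOF
$1
EOF
    printf 'id=%s runlevels=%s action=%s process=%s\n' \
        "$id" "$runlevels" "$action" "$process"
}

parse_inittab_line 'console::askfirst:-/bin/sh'
parse_inittab_line 'null::sysinit:/bin/mount -t proc proc /proc'
```

A check like this can be handy in a build script to catch a malformed entry before it reaches the target, since a broken inittab usually only shows up as a system that fails to start its services.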
For simple projects in which you want to launch a small number of daemons, and perhaps start a login shell on a serial terminal, it is easy to write the scripts manually. This would be appropriate if you are creating a roll your own (RYO) embedded Linux. However, you will find that hand-written init scripts rapidly become unmaintainable as the number of things to be configured increases. They tend not to be very modular and so need updating each time a new component is added.
Buildroot init scripts
Buildroot has been making effective use of the BusyBox init for many years. Buildroot has two scripts in /etc/init.d/ named rcS and rcK. The first one runs at boot up and iterates over all the scripts in /etc/init.d/ with names that begin with a capital S followed by two digits, and runs them in numerical order. These are the start scripts. The rcK script is run at shutdown and iterates over all the scripts beginning with a capital K followed by two digits, and runs them in numerical order. These are the kill scripts.
With this in place, it becomes easy for Buildroot packages to supply their own start and kill scripts, using the two-digit number to impose the order in which they should be run, and so the system becomes extensible. If you are using Buildroot, this is transparent. If not, you could use it as a model for writing your own BusyBox init scripts.
System V init
This init program was inspired by the one from Unix System V and so dates back to the mid 1980s. The version most often found in Linux distributions was written initially by Miquel van Smoorenburg. Until recently, it was the init daemon for almost all desktop and server distributions and a fair number of embedded systems as well. However, in recent years, it has been replaced by systemd, which I will describe in the next section.
The BusyBox init daemon I have just described is a trimmed down version of System V init. Compared to BusyBox init, System V init has two advantages. Firstly, the boot scripts are written in a well-known, modular format, making it easy to add new packages at build time or runtime. Secondly, it has the concept of runlevels, which allow a collection of programs to be started or stopped in one go when switching from one runlevel to another.
There are eight runlevels: 0 to 6, plus S:
S: Runs startup tasks
0: Halts the system
1 to 5: Available for general use
6: Reboots the system
Levels 1 to 5 can be used as you please. On desktop Linux distributions, they are conventionally assigned as follows:
1: Single user
2: Multi-user with no network configuration
3: Multi-user with network configuration
4: Not used
5: Multi-user with graphical login
The init program starts the default runlevel given by the initdefault line in /etc/inittab. You can change the runlevel at runtime using the command telinit [runlevel], which sends a message to init. You can find the current runlevel and the previous one using the runlevel command. Here is an example:
# runlevel
N 5
# telinit 3
INIT: Switching to runlevel: 3
# runlevel
5 3
Initially, the output from the runlevel command is N 5, indicating that there is no previous runlevel, because the runlevel has not changed since booting, and that the current runlevel is 5. After changing the runlevel, the output is 5 3, showing that there has been a transition from 5 to 3.
The halt and reboot commands switch to runlevels 0 and 6 respectively. You can override the default runlevel by giving a different one on the kernel command line as a single digit from 0 to 6. For example, to force the runlevel to be single user, you would append 1 to the kernel command line, and it would look something like this:
console=ttyAMA0 root=/dev/mmcblk1p2 1
Each runlevel has a number of scripts that stop things, called kill scripts, and another group that start things, the start scripts. When entering a new runlevel, init first runs the kill scripts in the new level, and then the start scripts in the new level. Daemons that are currently running and which have neither a start script nor a kill script in the new runlevel are sent a SIGTERM signal. In other words, the default action on switching runlevel is to terminate daemons unless told to do otherwise.
In truth, runlevels are not used that much in embedded Linux: most devices simply boot to the default runlevel and stay there. I have a feeling that it is partly because most people are not aware of them.
Runlevels are a simple and convenient way to switch between modes, for example, from production to maintenance mode.
System V init is an option in Buildroot and the Yocto Project. In both cases, the init scripts have been stripped of any bash shell specifics, so they will work with the BusyBox ash shell. However, Buildroot cheats somewhat by replacing the BusyBox init program with System V init and adding an inittab that mimics the behavior of BusyBox. Buildroot does not implement runlevels, except that switching to levels 0 or 6 halts or reboots the system.
Next, let's look at some of the details. The following examples are taken from the Morty version of the Yocto Project. Other distributions may implement the init scripts a little differently.
inittab
The init program begins by reading /etc/inittab, which contains entries that define what happens at each runlevel. The format is an extended version of the BusyBox inittab that I described in the preceding section, which is not surprising because BusyBox borrowed it from System V in the first place.
The format of each line in inittab is as follows:
id:runlevels:action:process
The fields are shown here:
id: A unique identifier of up to four characters.
runlevels: The runlevels for which this entry should be executed. This was left blank in the BusyBox inittab.
action: One of the keywords given in the following paragraph.
process: The command to run.
The actions are the same as for BusyBox init: sysinit, respawn, once, wait, restart, ctrlaltdel, and shutdown. However, System V init does not have askfirst, which is specific to BusyBox.
As an example, this is the complete inittab supplied by the Yocto Project target core-image-minimal for the qemuarm machine:
# /etc/inittab: init(8) configuration.
# $Id: inittab,v 1.91 2002/01/25 13:35:21 miquels Exp $
# The default runlevel.
id:5:initdefault:
# Boot-time system configuration/initialization script.
# This is run first except when booting in emergency (-b) mode.
si::sysinit:/etc/init.d/rcS
# What to do in single-user mode.
~~:S:wait:/sbin/sulogin
# /etc/init.d executes the S and K scripts upon change
# of runlevel.
#
# Runlevel 0 is halt.
# Runlevel 1 is single-user.
# Runlevels 2-5 are multi-user.
# Runlevel 6 is reboot.
l0:0:wait:/etc/init.d/rc 0
l1:1:wait:/etc/init.d/rc 1
l2:2:wait:/etc/init.d/rc 2
l3:3:wait:/etc/init.d/rc 3
l4:4:wait:/etc/init.d/rc 4
l5:5:wait:/etc/init.d/rc 5
l6:6:wait:/etc/init.d/rc 6
# Normally not reached, but fallthrough in case of emergency.
z6:6:respawn:/sbin/sulogin
AMA0:12345:respawn:/sbin/getty 115200 ttyAMA0
# /sbin/getty invocations for the runlevels.
#
# The "id" field MUST be the same as the last
# characters of the device (after "tty").
#
# Format:
# <id>:<runlevels>:<action>:<process>
#
1:2345:respawn:/sbin/getty 38400 tty1
The first entry, id:5:initdefault, sets the default runlevel to 5. The next entry, si::sysinit:/etc/init.d/rcS, runs the script rcS at boot up. There will be more about this later. A little further on, there is a group of seven entries beginning with l0:0:wait:/etc/init.d/rc 0. They run the /etc/init.d/rc script each time there is a change in the runlevel: this script is responsible for processing the start and kill scripts.
Toward the end of inittab, there is an entry that runs a getty daemon to generate a login prompt on /dev/ttyAMA0 when entering runlevels 1 through 5, thereby allowing you to log on and get an interactive shell:
AMA0:12345:respawn:/sbin/getty 115200 ttyAMA0
The ttyAMA0 device is the serial console on the ARM Versatile board we are emulating with QEMU; it will be different for other development boards. There is also an entry to run a getty on tty1, which is triggered when entering runlevels 2 through 5. This is a virtual console, which is often mapped to a graphical screen if you have built your kernel with CONFIG_FRAMEBUFFER_CONSOLE or VGA_CONSOLE. Desktop Linux distributions usually spawn six getty daemons on virtual terminals 1 to 6, which you can select with the key combination Ctrl+Alt+F1 through Ctrl+Alt+F6, with virtual terminal 7 reserved for the graphical screen. Virtual terminals are seldom used on embedded devices.
The /etc/init.d/rcS script that is run by the sysinit entry does little more than enter the runlevel, S:
#!/bin/sh
[...]
exec /etc/init.d/rc S
Hence, the first runlevel entered is S, followed by the default runlevel of 5. Note that runlevel S is not recorded and is never displayed as a prior runlevel by the runlevel command.
The init.d scripts
Each component that needs to respond to a runlevel change has a script in /etc/init.d to perform the change. The script should accept a single parameter, which is either start or stop. I will give an example of this later.
The runlevel handling script, /etc/init.d/rc, takes the runlevel it is switching to as a parameter. For each runlevel, there is a directory named rc<runlevel>.d:
# ls -d /etc/rc*
/etc/rc0.d  /etc/rc2.d  /etc/rc4.d  /etc/rc6.d
/etc/rc1.d  /etc/rc3.d  /etc/rc5.d  /etc/rcS.d
There you will find a set of scripts beginning with a capital S followed by two digits, and you may also find scripts beginning with a capital K. These are the start and kill scripts. Here is an example of the scripts for runlevel 5:
# ls /etc/rc5.d
S01networking   S20hwclock.sh  S99rmnologin.sh  S99stop-bootlogd
S15mountnfs.sh  S20syslog
These are in fact symbolic links back to the appropriate script in init.d. The rc script runs all the scripts beginning with a K first, adding the stop parameter, and then runs those beginning with an S, adding the start parameter. Once again, the two-digit code is there to impose the order in which the scripts should run.
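The ordering rule can be sketched as a small shell function. This is a simplified illustration and not the actual /etc/init.d/rc script from the Yocto Project, which does more work than this. The name run_rc_dir is hypothetical, and the sketch runs each script through sh rather than executing it directly.

```shell
#!/bin/sh
# run_rc_dir DIR: hypothetical, simplified stand-in for /etc/init.d/rc.
# Runs every K script in DIR with "stop", then every S script with "start".
run_rc_dir() {
    dir=$1
    # The shell expands each glob in sorted order, so the two-digit
    # number after K or S decides the sequence.
    for script in "$dir"/K[0-9][0-9]*; do
        [ -f "$script" ] && sh "$script" stop
    done
    for script in "$dir"/S[0-9][0-9]*; do
        [ -f "$script" ] && sh "$script" start
    done
    return 0
}
```

On a target you would call it as run_rc_dir /etc/rc5.d. Because the sort is lexical, two links with the same number, such as S20hwclock.sh and S20syslog, run in alphabetical order of the remainder of the name.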
Adding a new daemon
Imagine that you have a program named simpleserver, which is written as a traditional Unix daemon; in other words, it forks and runs in the background. The code for such a program is in MELP/chapter_10/simpleserver. You will need an init.d script like this, which you will find in MELP/chapter_10/simpleserver-sysvinit:
#!/bin/sh
case "$1" in
  start)
    echo "Starting simpleserver"
    start-stop-daemon -S -n simpleserver -a /usr/bin/simpleserver
    ;;
  stop)
    echo "Stopping simpleserver"
    start-stop-daemon -K -n simpleserver
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    exit 1
esac
exit 0
start-stop-daemon is a helper program that makes it easier to manipulate background processes such as this. It originally came from the Debian installer package, dpkg, but most embedded systems use the one from BusyBox. It starts the daemon with the -S parameter, making sure that there is never more than one instance running at any one time. To stop a daemon, you use the -K parameter, which causes it to send a signal, SIGTERM by default, to indicate to the daemon that it is time to terminate.
To make simpleserver operational, copy the script to the target as /etc/init.d/simpleserver and make it executable. Then, add links from each of the runlevels that you want to run this program from; in this case, only the default runlevel, 5:
# cd /etc/rc5.d
# ln -s ../init.d/simpleserver S99simpleserver
The number 99 means that this will be one of the last programs to be started. Bear in mind that there may be other links beginning S99, in which case the rc script will just run them in lexical order.
It is rare in embedded devices to have to worry too much about shutdown operations, but if there is something that needs to be done, add kill links to levels 0 and 6:
# cd /etc/rc0.d
# ln -s ../init.d/simpleserver K01simpleserver
# cd /etc/rc6.d
# ln -s ../init.d/simpleserver K01simpleserver
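If you assemble the root filesystem on a build machine rather than typing commands on the target, the same links can be created in a staging directory. The following is a minimal sketch under that assumption. The function name install_sysv_links and the idea of passing the staging directory as the first argument are illustrative, and the sketch uses the /etc/rc<runlevel>.d directories listed earlier.

```shell
#!/bin/sh
# install_sysv_links ROOT NAME: hypothetical helper that creates the start
# link in rc5.d and the kill links in rc0.d and rc6.d under the root
# filesystem staging directory ROOT.
install_sysv_links() {
    root=$1
    name=$2
    mkdir -p "$root/etc/rc0.d" "$root/etc/rc5.d" "$root/etc/rc6.d"
    # Relative link targets stay valid when ROOT is later mounted as /.
    ln -sf "../init.d/$name" "$root/etc/rc5.d/S99$name"
    ln -sf "../init.d/$name" "$root/etc/rc0.d/K01$name"
    ln -sf "../init.d/$name" "$root/etc/rc6.d/K01$name"
}
```

The numbers 99 and 01 are hard-coded here to match the example; a more general helper would take them as parameters.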
Starting and stopping services
You can interact with the scripts in /etc/init.d by calling them directly. Here is an example using the syslog script, which controls the syslogd and klogd daemons:
# /etc/init.d/syslog --help
Usage: syslog { start | stop | restart }
# /etc/init.d/syslog stop
Stopping syslogd/klogd: stopped syslogd (pid 198)
stopped klogd (pid 201)
done
# /etc/init.d/syslog start
Starting syslogd/klogd: done
All scripts implement start and stop, and they should also implement help. Some implement status as well, which will tell you whether the service is running or not. Mainstream distributions that still use System V init have a command named service to start and stop services, which hides the details of calling the scripts directly.
systemd
systemd, https://www.freedesktop.org/wiki/Software/systemd/, defines itself as a system and service manager. The project was initiated in 2010 by Lennart Poettering and Kay Sievers to create an integrated set of tools for managing a Linux system based around an init daemon. It also includes device management (udev) and logging, among other things. systemd is state of the art and is still evolving rapidly. It is common on desktop and server Linux distributions and is becoming popular on embedded Linux systems too, especially on more complex devices. So, how is it better than System V init for embedded systems?
The configuration is simpler and more logical (once you understand it). Rather than the sometimes convoluted shell scripts of System V init, systemd has unit configuration files, which are written in a well-defined format.
There are explicit dependencies between services rather than a two-digit code that merely sets the sequence in which the scripts are run.
It is easy to set the permissions and resource limits for each service, which is important for security.
It can monitor services and restart them if needed.
There are watchdogs for each service and for systemd itself.
Services are started in parallel, potentially reducing boot time.
A complete description of systemd is neither possible nor appropriate here. As with System V init, I will focus on the embedded use cases with examples based on the configuration produced by the Morty release of the Yocto Project, which has systemd version 230. I will give a quick overview, and then show you some specific examples.
Building systemd with the Yocto Project and Buildroot
The default init daemon in the Yocto Project is System V. To select systemd, add these lines to your conf/local.conf:
DISTRO_FEATURES_append = " systemd"
VIRTUAL-RUNTIME_init_manager = "systemd"
If you build with just those two lines in your local configuration, you will find that some components of System V init are still present, for example, the init scripts in /etc/init.d and commands such as runlevel and start-stop-daemon. If you want to strip those out as well, add this to your local configuration:
DISTRO_FEATURES_BACKFILL_CONSIDERED = "sysvinit"
VIRTUAL-RUNTIME_initscripts = ""
Buildroot uses BusyBox init by default. You can select systemd through menuconfig by looking in the menu System configuration -> Init system. You will also have to configure the toolchain to use glibc for the C library, since systemd does not support uClibc-ng or musl libc. In addition, there are restrictions on the version and configuration of the kernel. There is a complete list of library and kernel dependencies in the README file in the top level of the systemd source code.
Introducing targets, services, and units
Before I describe how systemd init works, I need to introduce these three key concepts:
Unit, which is a configuration file that describes a target, a service, and several other things. Units are text files that contain properties and values.
Service, which is a daemon that can be started and stopped, very much like a System V init service.
Target, which is a group of services, similar to, but more general than, a System V init runlevel. There is a default target, which is the group of services that are started at boot time.
You can change states and find out what is going on using the systemctl command.
Units
The basic item of configuration is the unit file. Unit files are found in three different places:
/etc/systemd/system: Local configuration
/run/systemd/system: Runtime configuration
/lib/systemd/system: Distribution-wide configuration
When looking for a unit, systemd searches the directories in that order, stopping as soon as it finds a match, and allowing you to override the behavior of a distribution-wide unit by placing a unit of the same name in /etc/systemd/system. You can disable a unit completely by creating a local file that is empty or linked to /dev/null.
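The search order can be illustrated with a short shell sketch that reports which copy of a unit would win. It is a simplification that only knows about the three directories named above (a real system may have more), and the function name find_unit and its ROOT argument are hypothetical. The ROOT argument lets you run it against a staging directory on the build machine.

```shell
#!/bin/sh
# find_unit ROOT NAME: hypothetical helper that mimics the lookup order
# described above for a root filesystem at ROOT. It prints the first match.
find_unit() {
    root=$1
    name=$2
    for dir in etc/systemd/system run/systemd/system lib/systemd/system; do
        # -e follows symlinks, so also test -L to catch a link to /dev/null
        # that does not resolve inside a staging directory.
        if [ -e "$root/$dir/$name" ] || [ -L "$root/$dir/$name" ]; then
            printf '%s\n' "/$dir/$name"
            return 0
        fi
    done
    return 1
}
```

For example, find_unit "" dbus.service on a target checks the live directories, and a result under /etc/systemd/system tells you that a local override is masking the distribution-wide unit.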
All unit files begin with a section marked [Unit], which contains basic information and dependencies. As an example, here is the Unit section of the D-Bus service, /lib/systemd/system/dbus.service:
[Unit]
Description=D-Bus System Message Bus
Documentation=man:dbus-daemon(1)
Requires=dbus.socket
In addition to the description and a reference to the documentation, there is a dependency on the dbus.socket unit expressed through the Requires keyword. This tells systemd to create a local socket when the D-Bus service is started.
Dependencies in the Unit section are expressed through the keywords Requires, Wants, and Conflicts:
Requires: This is a list of units that this unit depends on, which are started when this unit is started
Wants: This is a weaker form of Requires; the units listed are started but the current unit is not stopped if any of them fail
Conflicts: This is a negative dependency; the units listed are stopped when this one is started and, conversely, if one of them is started, this one is stopped
These three keywords define outgoing dependencies. They are used mostly to create dependencies between targets. There is another sort of dependency called an incoming dependency, which is used to create a link between a service and a target. In other words, outgoing dependencies are used to create the list of targets that need to be started as the system goes from one state to another, and incoming dependencies are used to determine the services that should be started or stopped in any particular state. Incoming dependencies are created by the WantedBy keyword, which I will describe in the section on installing your own service.
Processing the dependencies produces a list of units that should be started or stopped. The keywords Before and After determine the order in which they are started. The order of stopping is just the reverse of the start order:
Before: This unit should be started before the units listed
After: This unit should be started after the units listed
In the following example, the After directive makes sure that the web server is started after the network:
[Unit]
Description=Lighttpd Web Server
After=network.target
In the absence of the Before or After directive, the units will be started or stopped in parallel with no particular ordering.
Services
A service is a daemon that can be started and stopped, equivalent to a System V init service. A service is a type of unit file with a name ending in .service, for example, lighttpd.service.
A service unit has a [Service] section that describes how it should be run. Here is the relevant section from lighttpd.service:
[Service]
ExecStart=/usr/sbin/lighttpd -f /etc/lighttpd/lighttpd.conf -D
ExecReload=/bin/kill -HUP $MAINPID
These are the commands to run when starting the service and when reloading its configuration. There are many more configuration points you can add in here, so refer to the manual page for systemd.service(5).
Targets
A target is another type of unit, which groups services (or other types of unit). It is a type of unit that only has dependencies. Targets have names ending in .target, for example, multi-user.target. A target is a desired state, which performs the same role as System V init runlevels. For example, this is the complete unit for multi-user.target:
[Unit]
Description=Multi-User System
Documentation=man:systemd.special(7)
Requires=basic.target
Conflicts=rescue.service rescue.target
After=basic.target rescue.service rescue.target
AllowIsolate=yes
This says that the basic target must be started before the multi-user target. It also says that since it conflicts with the rescue target, starting the rescue target will cause the multi-user target to be stopped first.
How systemd boots the system
Now, we can see how systemd implements the bootstrap. systemd is run by the kernel as a result of /sbin/init being symbolically linked to /lib/systemd/systemd. It runs the default target, default.target, which is always a link to a desired target such as multi-user.target for a text login or graphical.target for a graphical environment. For example, if the default target is multi-user.target, you will find this symbolic link:
/etc/systemd/system/default.target -> /lib/systemd/system/multi-user.target
The default target may be overridden by passing systemd.unit=<new target> on the kernel command line. You can use systemctl to find out the default target, as shown here:
# systemctl get-default
multi-user.target
Starting a target such as multi-user.target creates a tree of dependencies that bring the system into a working state. In a typical system, multi-user.target depends on basic.target, which depends on sysinit.target, which depends on the services that need to be started early. You can print a graph using systemctl list-dependencies.
You can also list all the services and their current state using:
# systemctl list-units --type service
And the same for targets using:
# systemctl list-units --type target
Adding your own service
Using the same simpleserver example as before, here is a service unit, which you will find in MELP/chapter_10/simpleserver-systemd:
[Unit]
Description=Simple server
[Service]
Type=forking
ExecStart=/usr/bin/simpleserver
[Install]
WantedBy=multi-user.target
The [Unit] section only contains a description so that it shows up correctly when listed using systemctl and other commands. There are no dependencies; as I said, it is very simple.
The [Service] section points to the executable and has a flag to indicate that it forks. If it were even simpler and ran in the foreground, systemd would do the daemonizing for us and Type=forking would not be needed.
The [Install] section creates an incoming dependency on multi-user.target so that our server is started when the system goes into the multi-user mode.
Once the unit is saved in /etc/systemd/system/simpleserver.service, you can start and stop it using the commands systemctl start simpleserver and systemctl stop simpleserver. You can also use systemctl to find its current status:
# systemctl status simpleserver
simpleserver.service - Simple server
   Loaded: loaded (/etc/systemd/system/simpleserver.service; disabled)
   Active: active (running) since Thu 1970-01-01 02:20:50 UTC; 8s ago
 Main PID: 180 (simpleserver)
   CGroup: /system.slice/simpleserver.service
           └─180 /usr/bin/simpleserver -n
Jan 01 02:20:50 qemuarm systemd[1]: Started Simple server.
At this point, it will only start and stop on command, as shown here. To make it persistent, you need to add a permanent dependency to a target. This is the purpose of the [Install] section in the unit; it says that when this service is enabled it will become dependent on multi-user.target, and so will be started at boot time. You enable it using systemctl enable, like this:
# systemctl enable simpleserver
Created symlink from /etc/systemd/system/multi-user.target.wants/simpleserver.service to /etc/systemd/system/simpleserver.service.
Now, you can see how services add dependencies without having to keep on editing target unit files. A target can have a directory named <target_name>.target.wants, which can contain links to services. This is exactly the same as adding the dependent unit to the Wants list in the [Unit] section of the target. In this case, you will find that this link has been created:
/etc/systemd/system/multi-user.target.wants/simpleserver.service -> /etc/systemd/system/simpleserver.service
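Since enabling a service comes down to a symbolic link in a .wants directory, the same result can be produced while assembling a root filesystem, before the device has ever booted. Here is a minimal sketch of that idea. The function name enable_unit_offline and its arguments are hypothetical, and unlike systemctl enable it does not read the [Install] section; you have to pass the target yourself.

```shell
#!/bin/sh
# enable_unit_offline ROOT UNIT TARGET: hypothetical helper that creates the
# same kind of link as "systemctl enable", but inside a staging directory.
enable_unit_offline() {
    root=$1
    unit=$2
    target=$3
    wants="$root/etc/systemd/system/$target.wants"
    mkdir -p "$wants"
    # The link target is the path as seen on the running device, not the
    # staging path, so it resolves once ROOT becomes /.
    ln -sf "/etc/systemd/system/$unit" "$wants/$unit"
}
```

Called as enable_unit_offline "$ROOTFS" simpleserver.service multi-user.target, with ROOTFS standing for wherever your staging directory lives, it leaves a link with the same name and target as the one shown above.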
If this were an important service, you might want to restart it if it failed. You can accomplish that by adding this flag to the [Service] section:
Restart=on-abort
Other options for Restart are on-success, on-failure, on-abnormal, on-watchdog, and always.
Adding a watchdog
Watchdogs are a common requirement in embedded devices: you need to take action if a critical service stops working, usually by resetting the system. On most embedded SoCs, there is a hardware watchdog, which can be accessed via the /dev/watchdog device node. The watchdog is initialized with a timeout at boot, and then must be reset within that period, otherwise the watchdog will be triggered and the system will reboot. The interface with the watchdog driver is described in the kernel source in Documentation/watchdog, and the code for the drivers is in drivers/watchdog.
A problem arises if there are two or more critical services that need to be protected by a watchdog. systemd has a useful feature that distributes the watchdog between multiple services.
systemd can be configured to expect a regular keepalive call from a service and take action if it is not received, creating a per-service software watchdog. For this to work, you have to add code to the daemon to send the keepalive messages. It needs to check for a non-zero value in the WATCHDOG_USEC environment variable, and then call sd_notify(false, "WATCHDOG=1") within this time (a period of half of the watchdog timeout is recommended). There are examples in the systemd source code.
To enable the watchdog in the service unit, add something like this to the [Service] section:
WatchdogSec=30s
Restart=on-watchdog
StartLimitInterval=5min
StartLimitBurst=4
StartLimitAction=reboot-force
In this example, the service expects a keepalive every 30 seconds. If it fails to be delivered, the service will be restarted, but if it is restarted more than four times in five minutes, systemd will force an immediate reboot. Once again, there is a full description of these settings in the systemd.service(5) manual page.
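The keepalive side of this arrangement is normally written in C using sd_notify, as described above. For a service that is itself a shell script, a rough equivalent can be sketched with the systemd-notify command-line tool instead. Treat the following as an illustration under stated assumptions, not a tested recipe: the function names are hypothetical, it assumes systemd-notify is installed on the target, and it assumes the unit sets NotifyAccess=all, because the notification is sent by a child process rather than by the main process of the service.

```shell
#!/bin/sh
# keepalive_interval: prints half of WATCHDOG_USEC in whole seconds,
# or 0 when the variable is unset (no watchdog requested).
# Timeouts shorter than two seconds also round down to 0 here.
keepalive_interval() {
    echo $(( ${WATCHDOG_USEC:-0} / 2000000 ))
}

# watchdog_loop: hypothetical keepalive loop for a shell-based service.
# Assumes systemd-notify is available and the unit sets NotifyAccess=all.
watchdog_loop() {
    interval=$(keepalive_interval)
    [ "$interval" -gt 0 ] || return 0
    while :; do
        systemd-notify WATCHDOG=1
        sleep "$interval"
    done
}
```

With WatchdogSec=30s, systemd sets WATCHDOG_USEC to 30000000, so the loop would send a keepalive every 15 seconds. A real service should send the keepalive from the same code path that does the useful work, so that a hang in that work actually stops the keepalives; a detached background loop like this one would keep reporting success regardless.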
A watchdog like this takes care of individual services, but what if systemd itself fails, the kernel crashes, or the hardware locks up? In those cases, we need to tell systemd to use the watchdog driver: just add RuntimeWatchdogSec=NN to /etc/systemd/system.conf. systemd will reset the watchdog within that period, and so the system will reset if systemd fails for some reason.
Implications for embedded Linux
systemd has a lot of features that are useful in embedded Linux, including many that I have not mentioned in this brief description, such as resource control using slices (which are described in the manual pages for systemd.slice(5) and systemd.resource-control(5)), device management (udev(7)), and system logging facilities (journald(5)).
You have to balance that with its size: even with a minimal build of just the core components, systemd, udevd, and journald, it is approaching 10 MiB of storage, including the shared libraries.
You also have to keep in mind that systemd development follows the kernel closely, so it will not work on a kernel more than a year or two older than the release of systemd.
Further reading
systemd System and Service Manager: http://www.freedesktop.org/wiki/Software/systemd/. There are a lot of useful links at the bottom of that page.
Summary
Every Linux device needs an init program of some kind. If you are designing a system which only has to launch a small number of daemons at startup and remains fairly static after that, then BusyBox init is sufficient for your needs. It is usually a good choice if you are using Buildroot as the build system.
If, on the other hand, you have a system that has complex dependencies between services at boot time or runtime, and you have the storage space, then systemd would be the best choice. Even without the complexity, systemd has some useful features in the way it handles watchdogs, remote logging, and so on, so you should certainly give it serious thought.
Meanwhile, System V init lives on. It is well understood, and there are init scripts already in existence for every component that is important to us. It remains the default init for the Poky distribution of the Yocto Project.
In terms of reducing boot time, systemd is faster than System V init for a similar workload. However, if you are looking for a very fast boot, nothing can beat a simple BusyBox init with minimal boot scripts.
In the next chapter, I will turn my attention to the power management of Linux systems with the aim of showing how to reduce energy consumption. This will be especially useful if you are designing devices that run on battery power.
Managing Power
For devices operating on battery power, power management is a critical issue: anything we can do to reduce power usage will increase battery life. Even for devices running on mains power, reducing power usage has benefits in reducing the need for cooling and energy costs. In this chapter, I will introduce the four principles of power management:
Don't rush if you don't have to
Don't be ashamed of being idle
Turn off things you are not using
Sleep when there is nothing else to do
Putting these into more technical terms, the principles mean that the power management system should endeavor to reduce the CPU clock frequency; during idle periods, it should choose the deepest sleep state possible; it should reduce the load by powering down unused peripherals; and it should be able to put the whole system into a suspend state.
Linux has features that address each of these points. I will describe each one in turn, with examples and advice on how to apply them to an embedded system in order to make optimum use of power.
Some of the terminology of system power management is taken from the Advanced Configuration and Power Interface (ACPI) specification: terms such as C-states and P-states. I will describe these as we get to them. The full reference to the specification is given in the Further reading section.
In this chapter, we will specifically cover the following topics:
Measuring power usage
Scaling the clock frequency
Selecting the best idle state
Powering down peripherals
Putting the system to sleep
Measuring power usage
For the examples in this chapter, we need to use real hardware rather than virtual. This means that we need a BeagleBone Black with working power management. Unfortunately, the BSP for the BeagleBone that comes with the meta-yocto-bsp layer does not include the necessary firmware for the Power Management IC (PMIC), so we will have to build an image with a customized kernel. The procedure is the same as we covered in Chapter 6, Selecting a Build System.
First, get a copy of the Yocto Project Morty release, and add the meta-bbb-pm layer from the code archive:
$ git clone -b morty git://git.yoctoproject.org/poky.git
$ cd poky
$ cp -a MELP/chapter_11/poky/meta-bbb-pm .
$ source oe-init-build-env build-bbb
$ bitbake-layers add-layer ../meta-bbb-pm
Next, edit conf/local.conf and add these lines at the beginning of the file:
MACHINE = "beaglebone"
PREFERRED_PROVIDER_virtual/kernel = "ti-linux-kernel"
IMAGE_INSTALL_append = " powertop dropbear rt-tests amx3-cm3 kernel-vmlinux"
Then, build an image, for example, core-image-minimal:
$ bitbake core-image-minimal
Next, format a microSD card and copy the image to it using the scripts in the MELP archive:
$ MELP/format-sdcard.sh [your sdcard reader]
$ MELP/copy-yoctoproject-image-to-sdcard.sh \
    beaglebone core-image-minimal
Finally, boot the BeagleBone Black and check whether the power management is working:
# cat /sys/power/state
freeze standby mem disk
If you see all four states, everything is working fine. If you see only freeze, the power management subsystem is not working. Go back and double-check the previous steps.
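If you want to make the same check from a script, for example in a board bring-up test, something along these lines would do. It is only a sketch: the function name has_sleep_state is hypothetical, and the optional second argument exists purely so that the function can be tried against an ordinary file on a development machine.

```shell
#!/bin/sh
# has_sleep_state STATE [FILE]: succeeds if STATE is listed in FILE,
# which defaults to /sys/power/state. The helper name is hypothetical.
has_sleep_state() {
    wanted=$1
    file=${2:-/sys/power/state}
    [ -r "$file" ] || return 1
    # The file holds a single line of space-separated state names.
    for state in $(cat "$file"); do
        [ "$state" = "$wanted" ] && return 0
    done
    return 1
}
```

On the target, has_sleep_state mem || echo "suspend to RAM is not available" gives a quick pass or fail without having to read the output by eye.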
Now we can move on to measuring power usage. There are two approaches: external and internal. To measure power externally, from outside the system, we just need an ammeter to measure the current and a voltmeter to measure the voltage, and then multiply the two together in order to get the wattage. You can use basic meters that give a readout, which you then note down. Or, they can be much more sophisticated and combine data logging so that you can see the change in power as the load changes millisecond by millisecond. For the purposes of this chapter, I powered the BeagleBone from the mini USB port and used a cheap USB power monitor of the type that costs a few dollars.
The other approach is to use the monitoring systems that are built into Linux. You will find that plenty of information is reported to you via sysfs. There is also a very useful program called PowerTOP, which gathers information together from various sources and presents it in a single place. PowerTOP is a package for both the Yocto Project and Buildroot. You may notice that I have included powertop as an additional package in the Yocto Project configuration I provided at the beginning of this section. Here is an example of PowerTOP running on the BeagleBone Black:
[Screenshot: PowerTOP running on the BeagleBone Black]
In this screenshot, we can see that the system is quiet, with only 0.4% of CPU usage. I will show more interesting examples later on.
Scaling the clock frequency
Running for a kilometer takes more energy than walking. In a similar way, maybe running the CPU at a lower frequency can save energy. Let's see.
The power consumption of a CPU when executing code is the sum of a static component, caused by gate leakage current, among other things, and a dynamic component, caused by switching of the gates:
$P_{cpu} = P_{static} + P_{dyn}$
The dynamic power component is dependent on the total capacitance of the logic gates being switched, the clock frequency, and the square of the voltage:
$P_{dyn} = C f V^2$
From this, we can see that changing the frequency by itself is not going to save any energy because the same number of CPU cycles have to be completed in order to execute a given subroutine. If we reduce the frequency by half, it will take twice as long to complete the calculation, but the total energy consumed due to the dynamic power component will be the same. In fact, reducing the frequency may actually increase the energy used because it takes longer for the CPU to enter an idle state. So, in these conditions, it is best to use the highest frequency possible so that the CPU can go back to idle quickly. This is called the race to idle.
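The race-to-idle argument can be written out as a short derivation. The symbol $N$, the number of CPU cycles the subroutine needs, is introduced here only for illustration, and the static power is assumed to stay roughly constant while the CPU is active:

```latex
\begin{align*}
t          &= \frac{N}{f} \\
E_{dyn}    &= P_{dyn}\, t = C f V^2 \cdot \frac{N}{f} = C V^2 N \\
E_{static} &= P_{static}\, t = P_{static} \cdot \frac{N}{f}
\end{align*}
```

At a fixed voltage, the dynamic energy does not depend on the frequency at all, whereas the static energy grows as the frequency falls. That is why lowering the frequency alone does not help.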
There is another motivation to reduce frequency: thermal management. It may become necessary to operate at a lower frequency just to keep the temperature of the package within bounds. But that is not our focus here.
Therefore, if we want to save power, we have to be able to change the voltage that the CPU core operates at. But for any given voltage, there is a maximum frequency beyond which the switching of the gates becomes unreliable. Higher frequencies need higher voltages, and so the two need to be adjusted together. Many SoCs implement such a feature: it is called Dynamic Voltage and Frequency Scaling, or DVFS. Manufacturers calculate optimum combinations of core frequency and voltage. Each combination is called an Operating Performance Point, or OPP. The ACPI specification refers to them as P-states, with P0 being the OPP with the highest frequency. Although an OPP is a combination of a frequency and a voltage, they are most often referred to by the frequency component alone.
The CPUFreq driver
Linux has a component named CPUFreq that manages the transitions between OPPs. It is part of the board support package for each SoC. CPUFreq consists of drivers in drivers/cpufreq/, which make the transition from one OPP to another, and a set of governors that implement the policy of when to switch. It is controlled per-CPU via the /sys/devices/system/cpu/cpuN/cpufreq directory, with N being the CPU number. In there, we find a number of files, the most interesting of which are as follows:
cpuinfo_cur_freq, cpuinfo_max_freq and cpuinfo_min_freq: The current frequency for this CPU, together with the maximum and minimum, measured in kHz.
cpuinfo_transition_latency: The time, in nanoseconds, to switch from one OPP to another. If the value is unknown, it is set to -1.
scaling_available_frequencies: A list of OPP frequencies available on this CPU.
scaling_available_governors: A list of governors available on this CPU. The governors themselves are described after this list.
scaling_governor: The CPUFreq governor currently being used.
scaling_max_freq and scaling_min_freq: The range of frequencies available to the governor in kHz.
scaling_setspeed: A file that allows you to manually set the frequency when the governor is userspace, which I will describe in the following section.
The governor sets the policy to change the OPP. It can set the frequency between the limits of scaling_min_freq and scaling_max_freq. The governors are named as follows:
powersave: Always selects the lowest frequency.
performance: Always selects the highest frequency.
ondemand: Changes frequency based on the CPU utilization. If the CPU is idle less than 20% of the time, it sets the frequency to the maximum; if it is idle more than 30% of the time, it decrements the frequency by 5%.
conservative: As ondemand, but switches to higher frequencies in 5% steps rather than going immediately to the maximum.
userspace: Frequency is set by a user space program.
The parameters the ondemand governor uses to decide when to change OPP can be viewed and modified via /sys/devices/system/cpu/cpufreq/ondemand/. Both ondemand and conservative governors take into account the effort required to change frequency and voltage. This parameter is in cpuinfo_transition_latency. These calculations are for threads with a normal scheduling policy; if the thread is being scheduled in real-time, they will both immediately select the highest OPP so that the thread can meet its scheduling deadline.
The userspace governor allows the logic of selecting the OPP to be performed by a user space daemon. Examples include cpudyn and powernowd, although both are orientated toward x86-based laptops rather than embedded devices.
Using CPUFreq
Looking at the BeagleBone Black, we find that the OPPs are coded in the device tree. Here is an extract from am33xx.dtsi:
cpu0_opp_table: opp-table {
    compatible = "operating-points-v2-ti-cpu";
    syscon = <&scm_conf>;
    opp50@300000000 {
        opp-hz = /bits/ 64 <300000000>;
        opp-microvolt = <950000 931000 969000>;
        opp-supported-hw = <0x06 0x0010>;
        opp-suspend;
    };
    opp100@600000000 {
        opp-hz = /bits/ 64 <600000000>;
        opp-microvolt = <1100000 1078000 1122000>;
        opp-supported-hw = <0x06 0x0040>;
    };
    opp120@720000000 {
        opp-hz = /bits/ 64 <720000000>;
        opp-microvolt = <1200000 1176000 1224000>;
        opp-supported-hw = <0x06 0x0080>;
    };
    oppturbo@800000000 {
        opp-hz = /bits/ 64 <800000000>;
        opp-microvolt = <1260000 1234800 1285200>;
        opp-supported-hw = <0x06 0x0100>;
    };
    oppnitro@1000000000 {
        opp-hz = /bits/ 64 <1000000000>;
        opp-microvolt = <1325000 1298500 1351500>;
        opp-supported-hw = <0x04 0x0200>;
    };
};
We can confirm that these are the OPPs in use at runtime by viewing the available frequencies:
# cd /sys/devices/system/cpu/cpu0/cpufreq
# cat scaling_available_frequencies
300000 600000 720000 800000 1000000
By selecting the userspace governor, we can set the frequency by writing to scaling_setspeed, and so we can measure the power consumed at each OPP. These measurements are not very accurate, so do not take them too seriously.
First, with an idle system, the measured power consumption is 320 mW. This is independent of the frequency, which is what we would expect since this is the static component of the power consumption of this particular system.
Now, I want to know the maximum power consumed at each OPP by running a compute-bound load such as this:
# dd if=/dev/urandom of=/dev/null bs=1
The results are shown in the following table, with Delta power being the additional power usage above the idle system:
OPP      Freq, kHz    Power, mW    Delta power, mW
OPP50      300,000          370                 50
OPP100     600,000          505                185
OPP120     720,000          600                280
Turbo      800,000          640                320
Nitro    1,000,000          780                460
These measurements show the maximum power at the various OPPs. But it is not a fair test because the CPU is running at 100%, and so it is executing more instructions at higher frequencies. If we keep the load constant but vary the frequency, then we find the following:
OPP      Freq, kHz    CPU utilization, %    Power, mW
OPP50      300,000                    94          320
OPP100     600,000                    48          345
OPP120     720,000                    40          370
Turbo      800,000                    34          370
Nitro    1,000,000                    28          370
This shows a definite power saving at the lowest frequency, on the order of 15%.
Using PowerTOP, we can see the percentage of time spent in each OPP. The following screenshot shows the BeagleBone Black running a light load and using the ondemand governor:
In most cases, the ondemand governor is the best one to use. To select a particular governor, you can either configure the kernel with a default governor, for example, CPU_FREQ_DEFAULT_GOV_ONDEMAND, or you can use a boot script to change the governor at boot time. There is an example System V (SysV) init script in MELP/chapter_11/sysvinit-ondemand.sh, taken from Ubuntu 14.04.
For more information on the CPUFreq driver, take a look at the kernel source code in the Documentation/cpu-freq directory.
Selecting the best idle state
In the preceding section, we were concerned about the power used when the CPU is busy. In this section, we will look at how to save power when the CPU is idle.
When a processor has no more work to do, it executes a halt instruction and enters an idle state. While idle, the CPU uses less power. It exits the idle state when an event such as a hardware interrupt occurs. Most CPUs have multiple idle states that use varying amounts of power. Usually, there is a trade-off between the power usage and the latency, or the length of time, it takes to exit the state. In the ACPI specification, they are called C-states.
In the deeper C-states, more circuitry is turned off at the expense of losing some state, and so it takes longer to return to normal operation. For example, in some C-states the CPU caches may be powered off, and so when the CPU runs again, it may have to reload some information from the main memory. This is expensive, and so you only want to do this if there is a good chance that the CPU will remain in this state for some time. The number of states varies from one system to another. Each takes some time to recover from sleeping to being fully active.
The key to selecting the right idle state is to have a good idea of how long the CPU is going to be quiescent. Predicting the future is always tricky, but there are some things that can help. One is the current CPU load: if it is high now, it is likely to continue to be so in the immediate future, so a deep sleep would not be beneficial. Even if the load is low, it is worth looking to see whether there is a timer event that expires soon. If there is no load and no timer, then a deeper idle state is justified.
The part of Linux that selects the best idle state is the CPUIdle driver. There is a good deal of information about it in the kernel source code in the Documentation/cpuidle directory.
The CPUIdle driver
As with the CPUFreq subsystem, CPUIdle consists of a driver that is part of the BSP and a governor that determines the policy. Unlike CPUFreq, however, the governor cannot be changed at runtime and there is no interface for user space governors.
CPUIdle exposes information about each of the idle states in the /sys/devices/system/cpu/cpu0/cpuidle directory, in which there is a subdirectory for each of the sleep states, named state0 to stateN. state0 is the lightest sleep and stateN the deepest. Note that the numbering does not match that of the C-states and that CPUIdle does not have a state equivalent to C0 (running). For each state, there are these files:
desc: A short description of the state
disable: An option to disable this state by writing 1 to this file
latency: The time the CPU core takes to resume normal operation when exiting this state, in microseconds
name: The name of this state
power: The power consumed while in this idle state, in milliwatts
time: The total time spent in this idle state, in microseconds
usage: The count of the number of times this state was entered
In the case of the AM335x SoC on the BeagleBone Black, there are two idle states. This is the first:
# cd /sys/devices/system/cpu/cpu0/cpuidle
# grep "" state0/*
state0/desc:ARM WFI
state0/disable:0
state0/latency:1
state0/name:WFI
state0/power:4294967295
state0/residency:1
state0/time:1780015
state0/usage:2159
This state is named WFI, which refers to the ARM halt instruction, Wait For Interrupt. The latency is 1 microsecond because it is just a halt instruction, and the power consumed is given as 4294967295, which is -1 read as an unsigned 32-bit number and means that the power budget is not known (by CPUIdle at least). Now this is the second state:
# cd /sys/devices/system/cpu/cpu0/cpuidle
# grep "" state1/*
state1/desc:Bypass MPU PLL
state1/disable:0
state1/latency:100
state1/name:C1
state1/power:497
state1/residency:200
state1/time:8763012731
state1/usage:345285
This one is named C1. It has a higher latency of 100 microseconds, and this time a real power level is given: 497 milliwatts, which seems a little high to me. The idle states may be hardcoded into the CPUIdle driver or presented in the device tree. The AM335x does the former, so here is an example from a different SoC:
cpus {
    cpu: cpu0 {
        compatible = "arm,cortex-a9";
        enable-method = "ti,am4372";
        device_type = "cpu";
        reg = <0>;
        cpu-idle-states = <&mpu_gate>;
    };
    idle-states {
        /* the label referenced by cpu-idle-states above */
        mpu_gate: mpu_gate {
            compatible = "arm,idle-state";
            entry-latency-us = <40>;
            exit-latency-us = <100>;
            min-residency-us = <300>;
            local-timer-stop;
        };
    };
};
CPUIdle has two governors:
ladder: This steps idle states down or up, one at a time, depending on the time spent in the last idle period. It works well with a regular timer tick but not with a dynamic tick.
menu: This selects an idle state based on the expected idle time. It works well with dynamic tick systems.
You should choose one or the other depending on your configuration of NOHZ, which I will describe at the end of this section.
Once again, user interaction is via the sysfs filesystem. In the /sys/devices/system/cpu/cpuidle directory, you will find two files:
current_driver: This is the name of the cpuidle driver
current_governor_ro: This is the name of the governor
These show which driver and which governor are being used. The idle states can be shown in PowerTOP on the Idle stats tab. The following screenshot shows a BeagleBone Black using the menu governor:
This shows that when the system is idle, it is mostly going to the deeper idle state, C1, which is what we would want.
Tickless operation
A related topic is the tickless, or NOHZ, option. If the system is truly idle, the most likely source of interrupts will be the system timer, which is programmed to generate a regular time tick at a rate of HZ per second, where HZ is typically 100. Historically, Linux uses the timer tick as the main time base for measuring time-outs.
And yet it is plainly wasteful to wake the CPU up to process a timer interrupt if no timer events are registered for that particular moment. The dynamic tick kernel configuration option, CONFIG_NO_HZ, looks at the timer queue at the end of the timer processing routine and schedules the next interrupt at the time of the next event, avoiding unnecessary wake-ups and allowing the CPU to be idle for longer periods. In any power-sensitive application, the kernel should be configured with this option enabled.
Powering down peripherals
The discussion up to now has been about CPUs and how to reduce power consumption when they are running or idling. Now it is time to focus on other parts of the system, the peripherals, and see whether we can achieve power savings here.
In the Linux kernel, this is managed by the runtime power management system, or runtime pm for short. It works with drivers that support runtime pm, shutting down those that are not in use and waking them again when they are next needed. It is dynamic and should be transparent to user space. It is up to the device driver to implement the management of the hardware, but typically, it would include turning off the clock to the subsystem, also known as clock gating, and turning off core circuitry where possible.
The runtime power management is exposed via a sysfs interface. Each device has a subdirectory named power, in which you will find these files:
control: This allows user space to determine whether runtime pm is used on this device. If it is set to auto, then runtime pm is enabled, but by setting it to on, the device is always on and does not use runtime pm.
runtime_enabled: This reports whether runtime pm is enabled or disabled; if control is on, it reports forbidden.
runtime_status: This reports the current state of the device. It may be active, suspended, or unsupported.
autosuspend_delay_ms: This is the time before the device is suspended. -1 means waiting forever. Some drivers implement this if there is a significant cost to suspending the device hardware since it prevents rapid suspend/resume cycles.
To give a concrete example, I will look at the MMC driver on the BeagleBone Black:
# cd /sys/devices/platform/ocp/481d8000.mmc/mmc_host/mmc1/mmc1:0001/power
# grep "" *
async:disabled
autosuspend_delay_ms:3000
control:auto
runtime_active_kids:0
runtime_active_time:5170
runtime_enabled:enabled
runtime_status:suspended
runtime_suspended_time:137560
runtime_usage:0
So, runtime pm is enabled, the device is currently suspended, and there is a delay of 3000 milliseconds after it was last used before it will be suspended again. Now I read a block from the device and see whether it has changed:
# dd if=/dev/mmcblk1p3 of=/dev/null count=1
1+0 records in
1+0 records out
# grep "" *
async:disabled
autosuspend_delay_ms:3000
control:auto
runtime_active_kids:0
runtime_active_time:7630
runtime_enabled:enabled
runtime_status:active
runtime_suspended_time:200680
runtime_usage:0
Now the MMC driver is active and the power to the board has increased from 320 mW to 500 mW. If I look again after 3 seconds, it is once more suspended and the power has returned to 320 mW.
For more information on runtime pm, look in the kernel source code at Documentation/power/runtime_pm.txt.
Putting the system to sleep
There is one more power management technique to consider: putting the whole system into sleep mode with the expectation that it will not be used again for a while. In the Linux kernel, this is known as system sleep. It is usually user-initiated: the user decides that the device should be shut down for a while. For example, I shut the lid of my laptop and put it in my bag when it is time to go home.
Much of the support for system sleep in Linux comes from the support for laptops. In the laptop world, there are usually two options: suspend or hibernate. The first, also known as suspend to RAM, shuts everything down except the system memory, so the machine is still consuming a little power. When the system wakes up, the memory retains all the previous state, and my laptop is operational within a few seconds. If I select the hibernate option, the contents of memory are saved to the hard drive. The system consumes no power at all, and so it can stay in this state indefinitely, but on wake-up, it takes some time to restore the memory from disk. Hibernate is very seldom used in embedded systems, mostly because the flash storage tends to be quite slow on read/write but also because it is intrusive to the flow of work.
For more information, look at the kernel source code in the Documentation/power directory.
Power states
In the ACPI specification, the sleep states are called S-states. Linux supports four sleep states, which are shown in the following table, along with the corresponding ACPI S-state:
Linux system sleep state    ACPI S-state    Description
freeze                      [S0]            Stops (freezes) all activity in user space, but otherwise the CPU and memory are operating as normal. The power saving results from the fact that no user space code is being run. ACPI doesn't have an equivalent state: S0 is the closest match. S0 is the state for a running system.
standby                     S1              Just like freeze, but additionally takes all CPUs offline except the boot CPU.
mem                         S3              Powers down the system and puts the memory in self-refresh mode. Also known as suspend to RAM.
disk                        S4              Saves the memory to the hard disk and powers down. Also known as suspend to disk.
Not all systems have support for all states. You can find out which are available by reading the /sys/power/state file, for example:
# cat /sys/power/state
freeze standby mem disk
To enter one of the system sleep states, you just have to write the desired state to /sys/power/state.
For embedded devices, the most common need is to suspend to RAM using the mem option. For example, I can suspend the BeagleBone Black like this:
# echo mem > /sys/power/state
[ 1646.158274] PM: Syncing filesystems ...done.
[ 1646.178387] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 1646.188098] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 1646.197017] Suspending console(s) (use no_console_suspend to debug)
[ 1646.338657] PM: suspend of devices complete after 134.322 msecs
[ 1646.343428] PM: late suspend of devices complete after 4.716 msecs
[ 1646.348234] PM: noirq suspend of devices complete after 4.755 msecs
[ 1646.348251] Disabling non-boot CPUs ...
[ 1646.348264] PM: Successfully put all powerdomains to target state
The device powers down in less than a second and then power usage drops to below 10 milliwatts, which is the limit of measurement of my simple multimeter. But how do I wake it up again? That is the next topic.
Wakeup events
Before you suspend a device, you must have a method of waking it again. The kernel tries to help you here: if there is not at least one wakeup source, the system will refuse to suspend with the message:
No sources enabled to wake-up! Sleep abort.
Of course, this means that some parts of the system have to remain powered on even during the deepest sleep. This usually involves the Power Management IC (PMIC), the real-time clock (RTC), and may additionally include interfaces such as GPIO, UART, and Ethernet.
Wakeup events are controlled through sysfs. Each device in /sys/devices has a subdirectory power containing a wakeup file that will contain one of these strings:
enabled: This device will generate wakeup events
disabled: This device will not generate wakeup events
(empty): This device is not capable of generating wakeup events
To get a list of devices that can generate wakeups, we can search for all devices where wakeup contains either enabled or disabled:
$ find /sys/devices -name wakeup | xargs grep "abled"
In the case of the BeagleBone Black, the UARTs are wakeup sources, so pressing a key on the console wakes the BeagleBone:
[ 1646.348264] PM: Wakeup source UART
[ 1646.368482] PM: noirq resume of devices complete after 19.963 msecs
[ 1646.372482] PM: early resume of devices complete after 3.192 msecs
[ 1646.795109] net eth0: initializing cpsw version 1.12 (0)
[ 1646.798229] net eth0: phy found : id is : 0x7c0f1
[ 1646.798447] libphy: PHY 4a101000.mdio:01 not found
[ 1646.798469] net eth0: phy 4a101000.mdio:01 not found on slave 1
[ 1646.927874] PM: resume of devices complete after 555.337 msecs
[ 1647.003829] Restarting tasks ... done.
Timed wakeups from the real-time clock
Most systems have an RTC that can generate alarm interrupts up to 24 hours in the future. If so, the directory /sys/class/rtc/rtc0 will exist. It should contain the wakealarm file. Writing a number to wakealarm will cause it to generate an alarm that number of seconds later. If you also enable wakeup events from rtc, it will resume a suspended device. For example, this would wake the system up in 30 seconds:
# cd /sys/devices/platform/pmic_rtc.1/rtc/rtc0
# echo "+30" > wakealarm
# echo "enabled" > power/wakeup
Further reading
Advanced Configuration and Power Interface Specification, version 6.1, January 2016, http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf.
Summary
Linux has sophisticated power management functions. I have described four main components:
CPUFreq changes the Operating Performance Point of each processor core to reduce power on those that are busy but have some bandwidth to spare, and so allow the opportunity to scale the frequency back. OPPs are known as P-States in the ACPI specification.
CPUIdle selects deeper idle states when the CPU is not expected to be woken up for a while. Idle states are known as C-States in the ACPI specification.
Runtime power management will shut down peripherals that are not needed.
System sleep modes will put the whole system into a low power state. They are usually under end user control, for example, by pressing a standby button. System sleep states are known as S-States in the ACPI specification.
The majority of the power management is done for you by the BSP. Your main task is to make sure that it is configured correctly for your intended use cases. Only the last component, selecting a system sleep state, requires you to write some code that will allow the end user to enter and exit the state.
In the next chapter, we will look in detail at the Linux process model and describe what a process really is, how it relates to threads, how they cooperate, and how they are scheduled. Understanding these things is important if you want to create a robust and maintainable embedded system.
Learning About Processes and Threads
In the preceding chapters, we considered the various aspects of creating an embedded Linux platform. Now it is time to start looking at how you can use the platform to create a working device. In this chapter, I will talk about the implications of the Linux process model and how it encompasses multithreaded programs. I will look at the pros and cons of using single-threaded and multithreaded processes. I will also look at scheduling and differentiate between timeshare and real-time scheduling policies.
While these topics are not specific to embedded computing, it is important for a designer of an embedded device to have an overview of these topics. There are many good references on the subject, some of which I list at the end of the chapter, but in general, they do not consider the embedded use cases. In consequence, I will be concentrating on the concepts and design decisions rather than on the function calls and code.
In this chapter, we will cover the following topics:
Process or thread?
Processes
Threads
Scheduling
Process or thread?
Many embedded developers who are familiar with real-time operating systems (RTOS) consider the Unix process model to be cumbersome. On the other hand, they see a similarity between an RTOS task and a Linux thread and they have a tendency to transfer an existing design using a one-to-one mapping of RTOS tasks to threads. I have, on several occasions, seen designs in which the entire application is implemented with one process containing 40 or more threads. I want to spend some time considering whether this is a good idea or not. Let's begin with some definitions.
A process is a memory address space and a thread of execution, as shown in the following diagram. The address space is private to the process and so threads running in different processes cannot access it. This memory separation is created by the memory management subsystem in the kernel, which keeps a memory page mapping for each process and re-programs the memory management unit on each context switch. I will describe how this works in detail in Chapter 13, Managing Memory. Part of the address space is mapped to a file that contains the code and static data that the program is running, as shown here:
As the program runs, it will allocate resources such as stack space, heap memory, references to files, and so on. When the process terminates, these resources are reclaimed by the system: all the memory is freed up and all the file descriptors are closed.
Processes can communicate with each other using inter-process communication (IPC), such as local sockets. I will talk about IPC later on.
A thread is a thread of execution within a process. All processes begin with one thread that runs the main() function and is called the main thread. You can create additional threads, for example, using the POSIX function pthread_create(3), which results in multiple threads executing in the same address space, as shown in the following diagram:
Being in the same process, the threads share resources with each other. They can read and write the same memory and use the same file descriptors. Communication between threads is easy as long as you take care of the synchronization and locking issues.
So, based on these brief details, you can imagine two extreme designs for a hypothetical system with 40 RTOS tasks being ported to Linux.
You could map tasks to processes and have 40 individual programs communicating through IPC, for example, with messages sent through sockets. You would greatly reduce memory corruption problems since the main thread running in each process is protected from the others, and you would reduce resource leakage since each process is cleaned up after it exits. However, the message interface between processes is quite complex and, where there is tight cooperation between a group of processes, the number of messages might be large and become a limiting factor in the performance of the system. Furthermore, any one of the 40 processes may terminate, perhaps because of a bug causing it to crash, leaving the other 39 to carry on. Each process would have to handle the case that its neighbors are no longer running and recover gracefully.
At the other extreme, you could map tasks to threads and implement the system as a single process containing 40 threads. Cooperation becomes much easier because they share the same address space and file descriptors. The overhead of sending messages is reduced or eliminated, and context switches between threads are faster than between processes. The downside is that you have introduced the possibility of one task corrupting the heap or the stack of another. If any one of the threads encounters a fatal bug, the whole process will terminate, taking all the threads with it. Finally, debugging a complex multithreaded process can be a nightmare.
The conclusion you should draw is that neither design is ideal and that there is a better way. But before we get to that point, I will delve a little more deeply into the APIs and the behavior of processes and threads.
Processes
A process holds the environment in which threads can run: it holds the memory mappings, the file descriptors, the user and group IDs, and more. The first process is the init process, which is created by the kernel during boot and has a PID of one. Thereafter, processes are created by duplication in an operation known as forking.
Creating a new process
The POSIX function to create a process is fork(2). It is an odd function because for each successful call, there are two returns: one in the process that made the call, known as the Parent, and one in the newly created process, known as the Child, as shown in the following diagram:
Immediately after the call, the child is an exact copy of the parent: it has the same stack, the same heap, the same file descriptors, and it executes the same line of code, the one following fork. The only way the programmer can tell them apart is by looking at the return value of fork: it is zero for the child and greater than zero for the parent. Actually, the value returned to the parent is the PID of the newly created child process. There is a third possibility, which is that the return value is negative, which means that the fork call failed and there is still only one process.
Although the two processes are initially identical, they are in separate address spaces. Changes made to a variable by one will not be seen by the other. Under the hood, the kernel does not make a physical copy of the parent's memory, which would be quite a slow operation and consume memory unnecessarily. Instead, the memory is shared but marked with a copy-on-write (CoW) flag. If either parent or child modifies this memory, the kernel first makes a copy and then writes to the copy. This has the benefit of an efficient fork function while retaining the logical separation of process address spaces. I will discuss CoW in Chapter 13, Managing Memory.
Terminating a process
A process may be stopped voluntarily by calling the exit(3) function or, involuntarily, by receiving a signal that is not handled. One signal, in particular, SIGKILL, cannot be handled and so will always kill a process. In all cases, terminating the process will stop all threads, close all file descriptors, and release all memory. The system sends a signal, SIGCHLD, to the parent so that it knows this has happened.
Processes have a return value that is composed of either the argument to exit, if it terminated normally, or the signal number if it was killed. The chief use for this is in shell scripts: it allows you to test the return value from a program. By convention, 0 indicates success and other values indicate a failure of some sort.
The parent can collect the return value with the wait(2) or waitpid(2) functions. This causes a problem: there will be a delay between a child terminating and its parent collecting the return value. In that period, the return value must be stored somewhere, and the PID number of the now dead process cannot be reused. The process in this state is a zombie, which is displayed as state Z in the ps and top commands. As long as the parent calls wait or waitpid whenever it is notified of a child's termination (by means of the SIGCHLD signal; refer to Linux System Programming, Robert Love, O'Reilly Media or The Linux Programming Interface, Michael Kerrisk, No Starch Press for details on handling signals), zombies usually exist for too short a time to show up in process listings. But they will become a problem if the parent fails to collect the return value because eventually, there will not be enough resources to create any more processes.
The program in MELP/chapter_12/fork-demo illustrates process creation and termination:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
int main(void)
{
    int pid;
    int status;
    pid = fork();
    if (pid == 0) {
        /* Child: fork returned zero */
        printf("I am the child, PID %d\n", getpid());
        sleep(10);
        exit(42);
    } else if (pid > 0) {
        /* Parent: fork returned the PID of the child */
        printf("I am the parent, PID %d\n", getpid());
        wait(&status);
        printf("Child terminated, status %d\n", WEXITSTATUS(status));
    } else
        perror("fork");
    return 0;
}
The wait function blocks until a child process exits and stores the exit status. When you run it, you see something like this:
I am the parent, PID 13851
I am the child, PID 13852
Child terminated, status 42
The child process inherits most of the attributes of the parent, including the user and group IDs, all open file descriptors, signal handling, and scheduling characteristics.
Running a different program
The fork function creates a copy of a running program, but it does not run a different program. For that, you need one of the exec functions:
int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg,
           ..., char *const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execvpe(const char *file, char *const argv[],
            char *const envp[]);
Each takes a path to the program file to load and run. If the function succeeds, the kernel discards all the resources of the current process, including memory and file descriptors, and allocates memory to the new program being loaded. When the thread that called exec* returns, it returns not to the line of code after the call but to the main() function of the new program. There is an example of a command launcher in MELP/chapter_12/exec-demo: it prompts for a command, for example, /bin/ls, and forks and executes the string you enter. Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
int main(void)
{
    char command_str[128];
    int pid;
    int child_status;
    int wait_for = 1;
    while (1) {
        printf("sh> ");
        /* Limit the input to the buffer size and stop at end of input */
        if (scanf("%127s", command_str) != 1)
            break;
        pid = fork();
        if (pid == 0) {
            /* child */
            printf("cmd '%s'\n", command_str);
            execl(command_str, command_str, (char *)NULL);
            /* We should not return from execl, so only get
               to this line if it failed */
            perror("exec");
            exit(1);
        }
        if (wait_for) {
            waitpid(pid, &child_status, 0);
            printf("Done, status %d\n", child_status);
        }
    }
    return 0;
}
Here is what you will see when you run it:
# ./exec-demo
sh> /bin/ls
cmd '/bin/ls'
bin   etc   lost+found  proc  sys  var
boot  home  media       run   tmp
dev   lib   mnt         sbin  usr
Done, status 0
sh>
You terminate the program by typing Ctrl-C.
Itmightseemoddtohaveonefunctionthatduplicatesanexistingprocessandanotherthatdiscardsitsresourcesandloadsadifferentprogramintomemory,especiallysinceitiscommonforaforktobefollowedalmostimmediatelybyoneoftheexecfunctions.Mostoperatingsystemscombinethetwoactionsintoasinglecall.
Therearedistinctadvantages,however.Forexample,itmakesitveryeasytoimplementredirectionandpipesintheshell.Imaginethatyouwanttogetadirectorylisting.Thisisthesequenceofevents:
1. Youtypelsintheshellprompt.2. Theshellforksacopyofitself.3. Thechildexecs/bin/ls.4. Thelsprogramprintsthedirectorylistingtostdout(filedescriptor1),which
isattachedtotheTerminal.Youseethedirectorylisting.5. Thelsprogramterminatesandtheshellregainscontrol.
Nowimaginethatyouwantthedirectorylistingtobewrittentoafilebyredirectingtheoutputusingthe>character.Nowthesequenceisasfollows:
1. Youtypels>listing.txt.2. Theshellforksacopyofitself.3. Thechildopensandtruncatesthelisting.txtfileandusesdup2(2)tocopythe
filedescriptorofthefileoverfiledescriptor1(stdout).4. Thechildexecs/bin/ls.5. Theprogramprintsthelistingasbefore,butthistime,itiswritingto
listing.txt.6. Thelsprogramterminatesandtheshellregainscontrol.
There was an opportunity in step three to modify the environment of the child process before executing the program. The ls program does not need to know that it is writing to a file rather than a terminal. Instead of a file, stdout could be connected to a pipe and so the ls program, still unchanged, can send output to another program. This is part of the Unix philosophy of combining many small components that each do a job well, as described in The Art of Unix Programming, by Eric Steven Raymond, Addison Wesley (23 Sept, 2003) ISBN 978-0131429017, especially in the Pipes, Redirection, and Filters section.
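The redirection sequence can be sketched in a few lines of C. This is a minimal illustration rather than how any particular shell is written: the helper name run_redirected and its single-argument form are assumptions made for the example.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` with one argument and
 * with stdout redirected to `outfile`.
 * Returns the child's exit code, or -1 on error. */
int run_redirected(const char *path, const char *arg, const char *outfile)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: open and truncate the output file */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(127);
        /* Copy the file's descriptor over file descriptor 1 (stdout) */
        if (dup2(fd, STDOUT_FILENO) < 0)
            _exit(127);
        close(fd);
        execl(path, path, arg, (char *)NULL);
        _exit(127); /* only reached if execl failed */
    }
    /* Parent: wait for the child, as the shell does */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_redirected("/bin/ls", "/", "listing.txt") from main() should leave the directory listing in listing.txt. The program being run is unaware of the redirection because descriptor 1 was replaced before the exec.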
Daemons
We have encountered daemons in several places already. A daemon is a process that runs in the background, owned by the init process and not connected to a controlling Terminal. The steps to create a daemon are as follows:
1. Call fork to create a new process, after which the parent should exit, thus creating an orphan which will be re-parented to init.
2. The child process calls setsid(2), creating a new session and process group of which it is the sole member. The exact details do not matter here; you can simply consider this a way of isolating the process from any controlling terminal.
3. Change the working directory to the root directory.
4. Close all file descriptors and redirect stdin, stdout, and stderr (descriptors 0, 1, and 2) to /dev/null so that there is no input and all output is hidden.
Thankfully, all of the preceding steps can be achieved with a single function call, daemon(3).
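To make the four steps concrete, here is a hand-written sketch of them. The function name become_daemon is hypothetical, and step 4 is simplified: it only redirects descriptors 0, 1, and 2 and does not close other open descriptors. In real code, calling daemon(0, 0) is usually the simpler choice.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper that follows the four steps listed above.
 * Returns 0 in the daemon process and -1 on error.
 * The process that called it exits in step 1. */
int become_daemon(void)
{
    pid_t pid = fork();          /* step 1: create a new process */
    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);                /* parent exits; child is re-parented */

    if (setsid() < 0)            /* step 2: new session, no terminal */
        return -1;

    if (chdir("/") < 0)          /* step 3: work from the root directory */
        return -1;

    /* step 4 (simplified): point stdin, stdout, and stderr at /dev/null */
    int fd = open("/dev/null", O_RDWR);
    if (fd < 0)
        return -1;
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    dup2(fd, STDERR_FILENO);
    if (fd > STDERR_FILENO)
        close(fd);
    return 0;
}
```

A program would typically call become_daemon() early in main(), before opening files or creating threads, and carry on with its work only if the call returns 0.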
Inter-process communication
Each process is an island of memory. You can pass information from one to another in two ways. Firstly, you can copy it from one address space to the other. Secondly, you can create an area of memory that both can access and share the data.
The first is usually combined with a queue or buffer so that there is a sequence of messages passing between processes. This implies copying the message twice: first to a holding area and then to the destination. Some examples of this are sockets, pipes, and message queues.
The second way requires not only a method of creating memory that is mapped into two (or more) address spaces at once, but also a means of synchronizing access to that memory, for example, using semaphores or mutexes.
POSIX has functions for all of these. There is an older set of APIs known as System V IPC, which provides message queues, shared memory, and semaphores, but it is not as flexible as the POSIX equivalents so I will not describe them here. The manual page on svipc(7) gives an overview of the facilities, and there are more details in The Linux Programming Interface, by Michael Kerrisk, and Unix Network Programming, Volume 2, by W. Richard Stevens.
Message-based protocols are usually easier to program and debug than shared memory but are slow if the messages are large or many.
Message-based IPC
There are several options, which I will summarize here. The attributes that differentiate one from the other are as follows:
- Whether the message flow is uni- or bi-directional.
- Whether the data flow is a byte stream with no message boundary or discrete messages with boundaries preserved. In the latter case, the maximum size of a message is important.
- Whether messages are tagged with a priority.
The following table summarizes these properties for FIFOs, sockets, and message queues:
Property             FIFO          Unix socket: stream   Unix socket: datagram                POSIX message queue
Message boundary     Byte stream   Byte stream           Discrete                             Discrete
Uni/bi-directional   Uni           Bi                    Uni                                  Uni
Max message size     Unlimited     Unlimited             In the range of 100 KiB to 250 KiB   Default: 8 KiB, absolute maximum: 1 MiB
Priority levels      None          None                  None                                 0 to 32767
Unix (or local) sockets
Unix sockets fulfill most requirements and, coupled with the familiarity of the sockets API, they are by far the most common mechanism.
Unix sockets are created with the AF_UNIX address family and bound to a pathname. Access to the socket is determined by the access permission of the socket file. As with internet sockets, the socket type can be SOCK_STREAM or SOCK_DGRAM, the former giving a bidirectional byte stream and the latter providing discrete messages with preserved boundaries. Unix socket datagrams are reliable, which means that they will not be dropped or reordered. The maximum size for a datagram is system-dependent and is available via /proc/sys/net/core/wmem_max. It is typically 100 KiB or more.
Unix sockets do not have a mechanism to indicate the priority of a message.
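The difference between the two socket types can be seen with a small experiment. The sketch below uses socketpair(2) instead of a socket bound to a pathname, purely to keep the example short; the function name first_read_size is hypothetical. It writes two messages and reports how many bytes the first read returns.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write "one" and then "three" into a Unix socket of
 * the given type, and return the number of bytes delivered by a single
 * read at the other end (-1 on error). */
ssize_t first_read_size(int sock_type)
{
    int sv[2];
    char buf[64];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, sock_type, 0, sv) < 0)
        return -1;
    if (write(sv[0], "one", 3) == 3 && write(sv[0], "three", 5) == 5)
        n = read(sv[1], buf, sizeof(buf));
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_DGRAM the read returns 3, because each write is delivered as a separate message. With SOCK_STREAM there are no boundaries, so the same read may return the bytes of both writes together.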
FIFOs and named pipes
FIFO and named pipe are just different terms for the same thing. They are an extension of the anonymous pipe that is used to communicate between parent and child processes when implementing pipes in the shell.
A FIFO is a special sort of file, created by the mkfifo(1) command. As with Unix sockets, the file access permissions determine who can read and write. They are unidirectional, which means that there is one reader and usually one writer, though there may be several. The data is a pure byte stream but with a guarantee of the atomicity of writes that are no larger than PIPE_BUF, which is 4,096 bytes on Linux. In other words, writes of up to this size will not be interleaved with data from other writers, and so you will read the whole message in one go as long as the size of the buffer at your end is large enough. The default size of the FIFO buffer is 64 KiB on modern kernels and can be increased using fcntl(2) with F_SETPIPE_SZ up to the value in /proc/sys/fs/pipe-max-size, typically 1 MiB. There is no concept of priority.
POSIX message queues
Message queues are identified by a name, which must begin with a forward slash / and contain only one / character: message queues are actually kept in a pseudo filesystem of the type mqueue. You create a queue and get a reference to an existing queue through mq_open(3), which returns a message queue descriptor that, on Linux, is a file descriptor. Each message has a priority, and messages are read from the queue based on priority and then on the age order. The maximum message size is set by /proc/sys/fs/mqueue/msgsize_max.
The default value is 8 KiB, but you can set it to be any size in the range 128 bytes to 1 MiB by writing the value to /proc/sys/fs/mqueue/msgsize_max. Since the reference is a file descriptor, you can use select(2), poll(2), and other similar functions to wait for activity in the queue.
Refer to the Linux man page mq_overview(7) for more detail.
Summary of message-based IPC
Unix sockets are used most often because they offer all that is needed, except perhaps message priority. They are implemented on most operating systems, and so they confer maximum portability.
FIFOs are less used, mostly because they lack an equivalent to a datagram. On the other hand, the API is very simple, being the normal open(2), close(2), read(2), and write(2) file calls.
Message queues are the least commonly used of this group. The code paths in the kernel are not optimized in the way that socket (network) and FIFO (filesystem) calls are.
There are also higher-level abstractions, in particular, D-Bus, which is moving from mainstream Linux to embedded devices. D-Bus uses Unix sockets and shared memory under the surface.
Shared memory-based IPC
Sharing memory removes the need for copying data between address spaces, but introduces the problem of synchronizing accesses to it. Synchronization between processes is commonly achieved using semaphores.
POSIX shared memory
To share memory between processes, you first have to create a new area of memory and then map it to the address space of each process that wants access to it, as shown in the following diagram:
The naming of POSIX shared memory segments follows the pattern we encountered with message queues. The segments are identified by names that begin with a / character and have exactly one such character. The shm_open(3) function takes the name and returns a file descriptor for it. If it does not exist already and the O_CREAT flag is set, then a new segment is created. Initially, it has a size of zero. You can use the (misleadingly named) ftruncate(2) function to expand it to the desired size.
Once you have a descriptor for the shared memory, you map it to the address space of the process using mmap(2), and so threads in different processes can access the memory.
The program in MELP/chapter_12/shared-mem-demo gives an example of using a shared memory segment to communicate between processes. Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h> /* For mode constants */
#include <fcntl.h>
#include <sys/types.h>
#include <errno.h>
#include <semaphore.h>

#define SHM_SEGMENT_SIZE 65536
#define SHM_SEGMENT_NAME "/demo-shm"
#define SEMA_NAME "/demo-sem"

static sem_t *demo_sem;

/*
 * If the shared memory segment does not exist already, create it
 * Returns a pointer to the segment; exits if there is an error
 */
static void *get_shared_memory(void)
{
    int shm_fd;
    void *shm_p;

    /* Attempt to create the shared memory segment */
    shm_fd = shm_open(SHM_SEGMENT_NAME, O_CREAT | O_EXCL | O_RDWR,
                      0666);
    if (shm_fd != -1) {
        /* Succeeded: expand it to the desired size (Note: don't do
           this every time because ftruncate fills it with zeros) */
        printf("Creating shared memory and setting size=%d\n",
               SHM_SEGMENT_SIZE);
        if (ftruncate(shm_fd, SHM_SEGMENT_SIZE) < 0) {
            perror("ftruncate");
            exit(1);
        }
        /* Create a semaphore as well */
        demo_sem = sem_open(SEMA_NAME, O_RDWR | O_CREAT, 0666, 1);
        if (demo_sem == SEM_FAILED) {
            perror("sem_open");
            exit(1);
        }
    }
    else if (errno == EEXIST) {
        /* Already exists: open again without O_CREAT */
        shm_fd = shm_open(SHM_SEGMENT_NAME, O_RDWR, 0);
        demo_sem = sem_open(SEMA_NAME, O_RDWR);
        if (demo_sem == SEM_FAILED) {
            perror("sem_open");
            exit(1);
        }
    }
    if (shm_fd == -1) {
        perror("shm_open " SHM_SEGMENT_NAME);
        exit(1);
    }
    /* Map the shared memory; mmap reports failure with MAP_FAILED */
    shm_p = mmap(NULL, SHM_SEGMENT_SIZE, PROT_READ | PROT_WRITE,
                 MAP_SHARED, shm_fd, 0);
    if (shm_p == MAP_FAILED) {
        perror("mmap");
        exit(1);
    }
    return shm_p;
}

int main(int argc, char *argv[])
{
    char *shm_p;

    printf("%s PID=%d\n", argv[0], getpid());
    shm_p = get_shared_memory();
    while (1) {
        printf("Press enter to see the current contents of shm\n");
        getchar();
        sem_wait(demo_sem);
        printf("%s\n", shm_p);
        /* Write our signature to the shared memory */
        sprintf(shm_p, "Hello from process %d\n", getpid());
        sem_post(demo_sem);
    }
    return 0;
}
The program uses a shared memory segment to communicate a message from one process to another. The message is the Hello from process string followed by its PID. The get_shared_memory function is responsible for creating the memory segment, if it does not exist, or getting the file descriptor for it if it does. It returns a pointer to the memory segment. In the main function, there is a semaphore to synchronize access to the memory so that one process does not overwrite the message from another.
To try it out, you need two instances of the program running in separate terminal sessions. In the first terminal, you will see something like this:
# ./shared-mem-demo
./shared-mem-demo PID=271
Creating shared memory and setting size=65536
Press enter to see the current contents of shm
Press enter to see the current contents of shm
Hello from process 271
Because this is the first time the program is run, it creates the memory segment. Initially, the message area is empty, but after one run through the loop, it contains the PID of this process, 271. Now, you can run a second instance in another terminal:
# ./shared-mem-demo
./shared-mem-demo PID=279
Press enter to see the current contents of shm
Hello from process 271
Press enter to see the current contents of shm
Hello from process 279
It does not create the shared memory segment because it exists already, and it displays the message that it contains already, which is the PID of the other program. Pressing Enter causes it to write its own PID, which the first program would be able to see. In this way, the two programs can communicate with each other.
The POSIX IPC functions are part of the POSIX real-time extensions, and so you need to link with librt. Oddly, the POSIX semaphores are implemented in the POSIX threads library, so you need to link with the pthreads library as well. Hence the compilation arguments are as follows:
$ arm-cortex_a8-linux-gnueabihf-gcc shared-mem-demo.c -lrt -pthread \
-o shared-mem-demo
Threads
Now it is time to look at multithreaded processes. The programming interface for threads is the POSIX threads API, which was first defined in the IEEE POSIX 1003.1c standard (1995) and is commonly known as pthreads. It is implemented as an additional part of the C-library, libpthread.so. There have been two implementations of pthreads over the last 15 years or so: LinuxThreads and Native POSIX Thread Library (NPTL). The latter is much more compliant with the specification, particularly with regard to the handling of signals and process IDs. It is pretty dominant now, but you may come across some older versions of uClibc that use LinuxThreads.
Creating a new thread
The function to create a thread is pthread_create(3):
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine) (void *), void *arg);
It creates a new thread of execution that begins in the function start_routine and places a descriptor in the pthread_t pointed to by thread. The new thread inherits the scheduling parameters of the calling thread, but these can be overridden by passing a pointer to the thread attributes in attr. The thread will begin to execute immediately.
pthread_t is the main way to refer to the thread within the program, but the thread can also be seen from outside using a command such as ps -eLf:
UID    PID   PPID  LWP   C NLWP STIME TTY    TIME     CMD
...
chris  6072  5648  6072  0 3    21:18 pts/0  00:00:00 ./thread-demo
chris  6072  5648  6073  0 3    21:18 pts/0  00:00:00 ./thread-demo
In the output from ps shown above, the program thread-demo has two threads. The PID and PPID columns show that they all belong to the same process and have the same parent, as you would expect. The column marked LWP is interesting, though. LWP stands for Light Weight Process, which, in this context, is another name for a thread. The numbers in that column are also known as Thread IDs or TIDs. In the main thread, the TID is the same as the PID, but for the others, it is a different (higher) value. You can use a TID in places where the documentation states that you must give a PID, but be aware that this behavior is specific to Linux and is not portable. Here is a simple program that illustrates the life cycle of a thread (the code is in MELP/chapter_12/thread-demo):
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/syscall.h>

static void *thread_fn(void *arg)
{
    printf("New thread started, PID %d TID %d\n",
           getpid(), (pid_t)syscall(SYS_gettid));
    sleep(10);
    printf("New thread terminating\n");
    return NULL;
}

int main(void)
{
    pthread_t t;

    printf("Main thread, PID %d TID %d\n",
           getpid(), (pid_t)syscall(SYS_gettid));
    pthread_create(&t, NULL, thread_fn, NULL);
    pthread_join(t, NULL);
    return 0;
}
Note that in the function thread_fn I am retrieving the TID using syscall(SYS_gettid). There is a manual page for gettid(2), which explains that you have to call Linux directly through a syscall because older C libraries have no wrapper for it (glibc gained one in version 2.30).
There is a limit to the total number of threads that a given kernel can schedule. The limit scales according to the size of the system, from around 1,000 on small devices up to tens of thousands on larger embedded devices. The actual number is available in /proc/sys/kernel/threads-max. Once you reach this limit, fork and pthread_create will fail.
Terminating a thread
A thread terminates when:
- It reaches the end of its start_routine
- It calls pthread_exit(3)
- It is canceled by another thread calling pthread_cancel(3)
- The process that contains the thread terminates, for example, because of a thread calling exit(3), or the process receives a signal that is not handled, masked, or ignored
Note that if a multithreaded program calls fork, only the thread that made the call will exist in the new child process. Fork does not replicate all threads.
A thread has a return value, which is a void pointer. One thread can wait for another to terminate and collect its return value by calling pthread_join(3). There is an example in the code for thread-demo, mentioned in the preceding section. Joining is not optional: as with the zombie problem among processes, the resources of the thread, for example, the stack, cannot be freed up until another thread has joined with it. If threads remain unjoined, there is a resource leak in the program.
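As an illustration of collecting a return value, the following sketch starts a thread that computes a square and passes the answer back through its void pointer. The names square and square_in_thread are hypothetical, and returning a heap-allocated int is just one of several ways to pass a result back.

```c
#include <pthread.h>
#include <stdlib.h>

/* Thread function: returns a heap-allocated int holding the square of
 * the int that arg points to, or NULL if allocation fails. */
static void *square(void *arg)
{
    int *in = arg;
    int *out = malloc(sizeof(*out));

    if (out != NULL)
        *out = (*in) * (*in);
    return out;
}

/* Hypothetical helper: run square() in a new thread, join it, and store
 * the value it returned in *result. Returns 0 on success, -1 on error. */
int square_in_thread(int value, int *result)
{
    pthread_t t;
    void *ret = NULL;

    if (pthread_create(&t, NULL, square, &value) != 0)
        return -1;
    /* Joining collects the return value and releases the thread's resources */
    if (pthread_join(t, &ret) != 0)
        return -1;
    if (ret == NULL)
        return -1;
    *result = *(int *)ret;
    free(ret);
    return 0;
}
```

The thread must not return a pointer to one of its own local variables, because its stack is no longer valid once it has terminated.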
Compiling a program with threads
The support for POSIX threads is part of the C-library in the libpthread.so library. However, there is more to building programs with threads than linking the library: there have to be changes to the way the compiler generates code to make sure that certain global variables, such as errno, have one instance per thread rather than one for the whole process.
When building a threaded program, you must add the -pthread switch in the compile and link stages.
Inter-thread communication
The big advantage of threads is that they share the address space and can share memory variables. This is also a big disadvantage because it requires synchronization to preserve data consistency, in a manner similar to memory segments shared between processes but with the proviso that, with threads, all memory is shared. Threads can, in fact, create private memory using thread local storage (TLS), but I will not cover that here.
The pthreads interface provides the basics necessary to achieve synchronization: mutexes and condition variables. If you want more complex structures, you will have to build them yourself.
It is worth noting that all of the IPC methods described earlier, that is, sockets, pipes, and message queues, work equally well between threads in the same process.
Mutual exclusion
To write robust programs, you need to protect each shared resource with a mutex lock, and make sure that every code path that reads or writes the resource has locked the mutex first. If you apply this rule consistently, most of the problems should be solved. The ones that remain are associated with the fundamental behavior of mutexes. I will list them briefly here but will not go into detail:
- Deadlock: This occurs when mutexes become permanently locked. A classic situation is the deadly embrace in which two threads each require two mutexes and have managed to lock one of them but not the other. Each thread blocks, waiting for the lock the other has, and so they remain as they are. One simple rule to avoid the deadly embrace problem is to make sure that mutexes are always locked in the same order. Other solutions involve timeouts and back-off periods.
- Priority inversion: The delays caused by waiting for a mutex can cause a real-time thread to miss deadlines. The specific case of priority inversion happens when a high priority thread becomes blocked waiting for a mutex locked by a low priority thread. If the low priority thread is preempted by other threads of intermediate priority, the high priority thread is forced to wait for an unbounded length of time. There are mutex protocols called priority inheritance and priority ceiling that resolve the problem at the expense of greater processing overhead in the kernel for each lock and unlock call.
- Poor performance: Mutexes introduce minimal overhead to the code as long as threads don't have to block on them most of the time. If your design has a resource that is needed by a lot of threads, however, the contention ratio becomes significant. This is usually a design issue that can be resolved using finer grained locking or a different algorithm.
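The rule of always locking mutexes in the same order can be sketched as follows. The account structure and function names are hypothetical, and ordering the locks by address is only one possible convention. Two threads transfer in opposite directions, which would risk a deadly embrace if each locked its own source account first.

```c
#include <pthread.h>
#include <stdint.h>

struct account {
    pthread_mutex_t lock;
    long balance;
};

/* Move `amount` between two accounts. Both mutexes are always taken in
 * address order, whichever direction the transfer goes. */
void transfer(struct account *from, struct account *to, long amount)
{
    struct account *first = from, *second = to;

    if (from == to)
        return;
    if ((uintptr_t)first > (uintptr_t)second) {
        first = to;
        second = from;
    }
    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
    from->balance -= amount;
    to->balance += amount;
    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
}

struct job {
    struct account *from, *to;
    int count;
};

static void *worker(void *arg)
{
    struct job *j = arg;

    for (int i = 0; i < j->count; i++)
        transfer(j->from, j->to, 1);
    return NULL;
}

/* Run two threads transferring in opposite directions and return the
 * combined balance, which should be unchanged at 200. */
long run_opposing_transfers(int count)
{
    struct account a = { PTHREAD_MUTEX_INITIALIZER, 100 };
    struct account b = { PTHREAD_MUTEX_INITIALIZER, 100 };
    struct job ab = { &a, &b, count }, ba = { &b, &a, count };
    pthread_t t1, t2;

    pthread_create(&t1, NULL, worker, &ab);
    pthread_create(&t2, NULL, worker, &ba);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return a.balance + b.balance;
}
```

Because every code path that touches a balance holds both locks, no update is lost, and because the locks are always taken in the same order, neither thread can end up holding one lock while waiting for the other.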
Changing conditions
Cooperating threads need a method of alerting one another that something has changed and needs attention. That thing is called a condition and the alert is sent through a condition variable, or condvar.
A condition is just something that you can test to give a true or false result. A simple example is a buffer that contains either zero or some items. One thread takes items from the buffer and sleeps when it is empty. Another thread places items into the buffer and signals the other thread that it has done so because the condition that the other thread is waiting on has changed. If it is sleeping, it needs to wake up and do something. The only complexity is that the condition is, by definition, a shared resource and so has to be protected by a mutex.
Here is a simple program with two threads. The first is the producer: it wakes every second and puts an item of data into a global variable and then signals that there has been a change. The second thread is the consumer: it waits on the condition variable and tests the condition (that there is a string in the buffer of nonzero length) each time it wakes up. You can find the code in MELP/chapter_12/condvar-demo:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <string.h>

char g_data[128];
pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
pthread_mutex_t mutx = PTHREAD_MUTEX_INITIALIZER;

void *consumer(void *arg)
{
    while (1) {
        pthread_mutex_lock(&mutx);
        while (strlen(g_data) == 0)
            pthread_cond_wait(&cv, &mutx);
        /* Got data */
        printf("%s\n", g_data);
        /* Truncate to null string again */
        g_data[0] = 0;
        pthread_mutex_unlock(&mutx);
    }
    return NULL;
}

void *producer(void *arg)
{
    int i = 0;

    while (1) {
        sleep(1);
        pthread_mutex_lock(&mutx);
        sprintf(g_data, "Data item %d", i);
        pthread_mutex_unlock(&mutx);
        pthread_cond_signal(&cv);
        i++;
    }
    return NULL;
}
Note that when the consumer thread blocks on the condvar, it does so while holding a locked mutex, which would seem to be a recipe for deadlock the next time the producer thread tries to update the condition. To avoid this, pthread_cond_wait(3) atomically unlocks the mutex as it blocks the thread, and then locks it again before waking it and returning from the wait.
Partitioning the problem
Now that we have covered the basics of processes and threads and the ways in which they communicate, it is time to see what we can do with them.
Here are some of the rules I use when building systems:
- Rule 1: Keep tasks that have a lot of interaction together: It is important to minimize overheads by keeping closely inter-operating threads together in one process.
- Rule 2: Don't put all your threads in one basket: On the other hand, try and keep components with limited interaction in separate processes, in the interests of resilience and modularity.
- Rule 3: Don't mix critical and noncritical threads in the same process: This is an amplification of Rule 2: the critical part of the system, which might be a machine control program, should be kept as simple as possible and written in a more rigorous way than other parts. It must be able to continue even if other processes fail. If you have real-time threads, by definition, they must be critical and should go into a process by themselves.
- Rule 4: Threads shouldn't get too intimate: One of the temptations when writing a multithreaded program is to intermingle the code and variables between threads because it is all in one program and easy to do. Keep the threads modular, with well-defined interactions.
- Rule 5: Don't think that threads are for free: It is very easy to create additional threads, but there is a cost, not least in the additional synchronization necessary to coordinate their activities.
- Rule 6: Threads can work in parallel: Threads can run simultaneously on a multicore processor, giving higher throughput. If you have a large computing job, you can create one thread per core and make maximum use of the hardware. There are libraries to help you do this, such as OpenMP. You should probably not be coding parallel programming algorithms from scratch.
The Android design is a good illustration. Each application is a separate Linux process, which helps modularize memory management and ensures that one app crashing does not affect the whole system. The process model is also used for access control: a process can only access the files and resources that its UID and GIDs allow it to. There is a group of threads in each process. There is one to manage and update the user interface, one to handle signals from the operating system, several to manage dynamic memory allocation and the freeing up of Java objects, and a worker pool of at least two threads to receive messages from other parts of the system using the Binder protocol.
To summarize, processes provide resilience because each process has a protected memory space, and when the process terminates, all resources including memory and file descriptors are freed up, reducing resource leaks. On the other hand, threads share resources and can communicate easily through shared variables and can cooperate by sharing access to files and other resources. Threads give parallelism through worker pools and other abstractions, which is useful in multicore processors.
Scheduling
The second big topic I want to cover in this chapter is scheduling. The Linux scheduler has a queue of threads that are ready to run, and its job is to schedule them on CPUs as they become available. Each thread has a scheduling policy that may be time-shared or real-time. The time-shared threads have a niceness value that increases or reduces their entitlement to CPU time. The real-time threads have a priority such that a higher priority thread will preempt a lower one. The scheduler works with threads, not processes. Each thread is scheduled regardless of which process it is running in.
The scheduler runs when:
- A thread is blocked by calling sleep() or another blocking system call
- A time-shared thread exhausts its time slice
- An interrupt causes a thread to be unblocked, for example, because of I/O completing
For background information on the Linux scheduler, I recommend that you read the chapter on process scheduling in Linux Kernel Development, 3rd edition by Robert Love.
Fairness versus determinism
I have grouped the scheduling policies into categories of time-shared and real-time. Time-shared policies are based on the principle of fairness. They are designed to make sure that each thread gets a fair amount of processor time and that no thread can hog the system. If a thread runs for too long, it is put to the back of the queue so that others can have a go. At the same time, a fairness policy needs to adjust to threads that are doing a lot of work and give them the resources to get the job done. Time-shared scheduling is good because of the way it automatically adjusts to a wide range of workloads.
On the other hand, if you have a real-time program, fairness is not helpful. Instead, you then want a policy that is deterministic, which will give you at least minimal guarantees that your real-time threads will be scheduled at the right time so that they don't miss their deadlines. This means that a real-time thread must preempt time-shared threads. Real-time threads also have a static priority that the scheduler can use to choose between them when there are several of them to run at once. The Linux real-time scheduler implements a fairly standard algorithm that runs the highest priority real-time thread. Most RTOS schedulers are also written in this way.
Both types of thread can coexist. Those requiring deterministic scheduling are scheduled first and the time remaining is divided between the time-shared threads.
Time-shared policies
Time-shared policies are designed for fairness. From Linux 2.6.23 onward, the scheduler used has been Completely Fair Scheduler (CFS). It does not use time slices in the normal sense of the word. Instead, it calculates a running tally of the length of time a thread would be entitled to run if it had its fair share of CPU time, and it balances that with the actual amount of time it has run for. If it exceeds its entitlement and there are other time-shared threads waiting to run, the scheduler will suspend the thread and run a waiting thread instead.
The time-shared policies are as follows:
- SCHED_NORMAL (also known as SCHED_OTHER): This is the default policy. The vast majority of Linux threads use this policy.
- SCHED_BATCH: This is similar to SCHED_NORMAL except that threads are scheduled with a larger granularity; that is, they run for longer but have to wait longer until they are scheduled again. The intention is to reduce the number of context switches for background processing (batch jobs) and reduce the amount of CPU cache churn.
- SCHED_IDLE: These threads are run only when there are no threads of any other policy ready to run. It is the lowest possible priority.
There are two pairs of functions to get and set the policy and priority of a thread. The first pair takes a PID as a parameter and affects the main thread in a process:
struct sched_param {
    ...
    int sched_priority;
    ...
};
int sched_setscheduler(pid_t pid, int policy,
                       const struct sched_param *param);
int sched_getscheduler(pid_t pid);
The second pair operates on pthread_t and can change the parameters of the other threads in a process:
int pthread_setschedparam(pthread_t thread, int policy,
                          const struct sched_param *param);
int pthread_getschedparam(pthread_t thread, int *policy,
                          struct sched_param *param);
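As a small usage sketch of the second pair, the code below asks for the policy and priority of the calling thread. The helper names current_thread_policy and policy_name are hypothetical. Only the policies defined by POSIX are named, because SCHED_BATCH and SCHED_IDLE are Linux-specific and need _GNU_SOURCE to be visible.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: return a printable name for the POSIX policies */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER:
        return "SCHED_OTHER";
    case SCHED_FIFO:
        return "SCHED_FIFO";
    case SCHED_RR:
        return "SCHED_RR";
    default:
        return "other";
    }
}

/* Hypothetical helper: return the policy of the calling thread and store
 * its priority in *priority. Returns -1 on error. */
int current_thread_policy(int *priority)
{
    int policy;
    struct sched_param param;

    if (pthread_getschedparam(pthread_self(), &policy, &param) != 0)
        return -1;
    *priority = param.sched_priority;
    return policy;
}
```

For a thread that has not changed its scheduling parameters, this would normally report SCHED_OTHER with a priority of 0, since the priority field is only meaningful for the real-time policies.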
Niceness
Some time-shared threads are more important than others. You can indicate this with the nice value, which multiplies a thread's CPU entitlement by a scaling factor. The name comes from the function call, nice(2), which has been part of Unix since the early days. A thread becomes nice by reducing its load on the system, or moves in the opposite direction by increasing it. The range of values is from 19, which is really nice, to -20, which is really not nice. The default value is 0, which is averagely nice, or so-so.
The nice value can be changed for SCHED_NORMAL and SCHED_BATCH threads. To reduce niceness, which increases the CPU load, you need the CAP_SYS_NICE capability, which is available to the root user.
Almost all the documentation for functions and commands that change the nice value (nice(2) and the nice and renice commands) talks in terms of processes. However, it really relates to threads. As mentioned in the preceding section, you can use a TID in place of a PID to change the nice value of an individual thread. One other discrepancy in the standard descriptions of nice is this: the nice value is referred to as the priority of a thread (or sometimes, mistakenly, a process). I believe this is misleading and confuses the concept with real-time priority, which is a completely different thing.
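The following sketch shows one way to apply a nice value to a single thread by passing its TID to setpriority(2). The function name set_thread_nice and the -100 error value are hypothetical choices for this example. As the text notes, the per-thread behavior is specific to Linux.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set the nice value of the calling thread only,
 * using its TID where the documentation asks for a PID.
 * Returns the nice value now in effect, or -100 on error. */
int set_thread_nice(int value)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);

    if (setpriority(PRIO_PROCESS, (id_t)tid, value) < 0)
        return -100;
    /* getpriority can legitimately return -1, so check errno as well */
    errno = 0;
    int now = getpriority(PRIO_PROCESS, (id_t)tid);
    if (now == -1 && errno != 0)
        return -100;
    return now;
}
```

Raising the nice value (being nicer) needs no privilege, so set_thread_nice(10) should work for any user, whereas asking for a lower value than the current one is expected to fail without CAP_SYS_NICE.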
Real-time policies
Real-time policies are intended for determinism. The real-time scheduler will always run the highest priority real-time thread that is ready to run. Real-time threads always preempt timeshare threads. In essence, by selecting a real-time policy over a timeshare policy, you are saying that you have inside knowledge of the expected scheduling of this thread and wish to override the scheduler's built-in assumptions.
There are two real-time policies:
- SCHED_FIFO: This is a run to completion algorithm, which means that once the thread starts to run, it will continue until it is preempted by a higher priority real-time thread, it is blocked in a system call, or until it terminates (completes).
- SCHED_RR: This is a round robin algorithm that will cycle between threads of the same priority if they exceed their time slice, which is 100 ms by default. Since Linux 3.9, it has been possible to control the timeslice value through /proc/sys/kernel/sched_rr_timeslice_ms. Apart from this, it behaves in the same way as SCHED_FIFO.
Each real-time thread has a priority in the range 1 to 99, with 99 being the highest.
To give a thread a real-time policy, you need CAP_SYS_NICE, which is given only to the root user by default.
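Rather than hard-coding 1 and 99, a program can ask the kernel for the valid range of a policy. The helper below is a hypothetical sketch built on sched_get_priority_min(2) and sched_get_priority_max(2).

```c
#include <sched.h>

/* Hypothetical helper: return 1 if `priority` is within the range that
 * the system reports for `policy`, otherwise 0. */
int rt_priority_valid(int policy, int priority)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo < 0 || hi < 0)
        return 0; /* unknown policy */
    return priority >= lo && priority <= hi;
}
```

On Linux, the range reported for both SCHED_FIFO and SCHED_RR is 1 to 99, so rt_priority_valid(SCHED_FIFO, 50) returns 1, while a priority of 0 or 100 is rejected.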
One problem with real-time scheduling, both in Linux and elsewhere, is that a thread that becomes compute bound, often because a bug has caused it to loop indefinitely, will prevent real-time threads of lower priority from running, along with all the timeshare threads. In this case, the system becomes erratic and may lock up completely. There are a couple of ways to guard against this possibility.
First, since Linux 2.6.25, the scheduler has, by default, reserved 5% of CPU time for non-real-time threads so that even a runaway real-time thread cannot completely halt the system. It is configured via two kernel controls:
/proc/sys/kernel/sched_rt_period_us
/proc/sys/kernel/sched_rt_runtime_us
They have default values of 1,000,000 (1 second) and 950,000 (950 ms), respectively, which means that out of every second, 50 ms is reserved for non-real-time processing. If you want real-time threads to be able to take 100%, then set sched_rt_runtime_us to -1.
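The arithmetic behind the 5% figure can be written as a small function. The name rt_reserved_percent is hypothetical, and the function works on values you pass in rather than reading the two /proc files itself.

```c
/* Hypothetical helper: percentage of CPU time left for non-real-time
 * threads, given the values of sched_rt_period_us and
 * sched_rt_runtime_us. A runtime of -1 means real-time threads are not
 * limited. Returns -1.0 if the period is not positive. */
double rt_reserved_percent(long period_us, long runtime_us)
{
    if (period_us <= 0)
        return -1.0;
    if (runtime_us < 0 || runtime_us >= period_us)
        return 0.0;
    return 100.0 * (double)(period_us - runtime_us) / (double)period_us;
}
```

With the default values, rt_reserved_percent(1000000, 950000) gives 5.0, which corresponds to the 50 ms in every second mentioned above.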
The second option is to use a watchdog, either hardware or software, to monitor the execution of key threads and take action when they begin to miss deadlines. I mentioned watchdogs in Chapter 10, Starting Up – The init Program.
Choosing a policy
In practice, time-shared policies satisfy the majority of computing workloads. Threads that are I/O-bound spend a lot of time blocked and always have some spare entitlement in hand. When they are unblocked, they will be scheduled almost immediately. Meanwhile, CPU-bound threads will naturally take up any CPU cycles left over. Positive nice values can be applied to the less important threads and negative values to the more important ones.
Of course, this is only average behavior; there are no guarantees that this will always be the case. If more deterministic behavior is needed, then real-time policies will be required. The things that mark out a thread as being real-time are as follows:
- It has a deadline by which it must generate an output
- Missing the deadline would compromise the effectiveness of the system
- It is event-driven
- It is not compute-bound
Examples of real-time tasks include the classic robot arm servo controller, multimedia processing, and communication processing. I will discuss real-time system design later on in Chapter 16, Real-Time Programming.
Choosing a real-time priority
Choosing real-time priorities that work for all expected workloads is a tricky business and a good reason to avoid real-time policies in the first place.
The most widely used procedure for choosing priorities is known as Rate Monotonic Analysis (RMA), after the 1973 paper by Liu and Layland. It applies to real-time systems with periodic threads, which is a very important class. Each thread has a period and a utilization, which is the proportion of the period it will be executing. The goal is to balance the load so that all threads can complete their execution phase before the next period. RMA states that this can be achieved if:
- The highest priorities are given to the threads with the shortest periods
- The total utilization is less than 69%
The total utilization is the sum of all of the individual utilizations. The analysis also makes the assumption that the interaction between threads or the time spent blocked on mutexes and the like is negligible.
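The utilization test can be sketched as below. The structure and function names are hypothetical, the task figures in the usage note are made up for illustration, and the check uses the 69% figure quoted above as a simple threshold.

```c
#include <stddef.h>

/* Hypothetical description of a periodic thread */
struct periodic_task {
    double period_ms; /* how often the thread runs */
    double exec_ms;   /* execution time needed in each period */
};

/* Sum of the individual utilizations (execution time / period) */
double total_utilization(const struct periodic_task *tasks, size_t n)
{
    double total = 0.0;

    for (size_t i = 0; i < n; i++)
        total += tasks[i].exec_ms / tasks[i].period_ms;
    return total;
}

/* Returns 1 if the total utilization is below the 69% bound */
int rma_bound_met(const struct periodic_task *tasks, size_t n)
{
    return total_utilization(tasks, n) < 0.69;
}
```

For example, three threads with periods of 10, 50, and 100 ms that execute for 2, 10, and 20 ms each use 20% of the CPU, giving about 60% in total, so the bound is met. Two threads using 50% and 30% total 80%, so RMA gives no guarantee. In both cases the thread with the shortest period would be given the highest priority.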
Further reading
The following resources have further information on the topics introduced in this chapter:
- The Art of Unix Programming, by Eric Steven Raymond, Addison Wesley; (23 Sept, 2003) ISBN 978-0131429017
- Linux System Programming, 2nd edition, by Robert Love, O'Reilly Media; (8 Jun, 2013) ISBN-10: 1449339530
- Linux Kernel Development, 3rd edition, by Robert Love, Addison-Wesley Professional; (July 2, 2010) ISBN-10: 0672329468
- The Linux Programming Interface, by Michael Kerrisk, No Starch Press; (October 2010) ISBN 978-1-59327-220-3
- UNIX Network Programming: v. 2: Interprocess Communications, 2nd Edition, by W. Richard Stevens, Prentice Hall; (25 Aug, 1998) ISBN-10: 0132974290
- Programming with POSIX Threads, by Butenhof, David R, Addison-Wesley Professional
- Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment, by C. L. Liu and James W. Layland, Journal of ACM, 1973, vol 20, no 1, pp. 46-61
Summary
The long Unix heritage that is built into Linux and the accompanying C libraries provides almost everything you need in order to write stable and resilient embedded applications. The issue is that for every job, there are at least two ways to achieve the end you desire.
In this chapter, I focused on two aspects of system design: partitioning into separate processes, each with one or more threads to get the job done, and scheduling of those threads. I hope that I shed some light on this and have given you the basis for further study into all of them.
In the next chapter, I will examine another important aspect of system design: memory management.
Managing Memory
This chapter covers issues related to memory management, which is an important topic for any Linux system but especially for embedded Linux, where system memory is usually in limited supply. After a brief refresher on virtual memory, I will show you how to measure memory usage, how to detect problems with memory allocation, including memory leaks, and what happens when you run out of memory. You will have to understand the tools that are available, from simple tools such as free and top, to complex tools such as mtrace and Valgrind.
In this chapter, we will cover the following topics:
- Virtual memory basics.
- Kernel space memory layout.
- User space memory layout.
- The process memory map.
- Swapping.
- Mapping memory with mmap.
- How much memory does my application use?
- Per-process memory usage.
- Identifying memory leaks.
- Running out of memory.
VirtualmemorybasicsTorecap,Linuxconfiguresthememorymanagementunit(MMU)oftheCPUtopresentavirtualaddressspacetoarunningprogramthatbeginsatzeroandendsatthehighestaddress,0xffffffff,ona32-bitprocessor.Thisaddressspaceisdividedintopagesof4KiB(therearerareexamplesofsystemsusingotherpagesizes).
Linux divides this virtual address space into an area for applications, called user space, and an area for the kernel, called kernel space. The split between the two is set by a kernel configuration parameter named PAGE_OFFSET. In a typical 32-bit embedded system, PAGE_OFFSET is 0xc0000000, giving the lower 3 gigabytes to user space and the top gigabyte to kernel space. The user address space is allocated per process so that each process runs in a sandbox, separated from the others. The kernel address space is the same for all processes: there is only one kernel.
Pages in this virtual address space are mapped to physical addresses by the MMU, which uses page tables to perform the mapping.
Each page of virtual memory may be:
- Unmapped, so that trying to access these addresses will result in a SIGSEGV
- Mapped to a page of physical memory that is private to the process
- Mapped to a page of physical memory that is shared with other processes
- Mapped and shared with a copy on write (CoW) flag set: a write is trapped in the kernel, which makes a copy of the page and maps it to the process in place of the original page before allowing the write to take place
- Mapped to a page of physical memory that is used by the kernel
The kernel may additionally map pages to reserved memory regions, for example, to access registers and memory buffers in device drivers.
An obvious question is this: why do we do it this way instead of simply referencing physical memory directly, as a typical RTOS would?
There are numerous advantages to virtual memory, some of which are described here:
- Invalid memory accesses are trapped and applications are alerted by SIGSEGV
- Processes run in their own memory space, isolated from others
- Efficient use of memory through the sharing of common code and data, for example, in libraries
- The possibility of increasing the apparent amount of physical memory by adding swap files, although swapping on embedded targets is rare
These are powerful arguments, but we have to admit that there are some disadvantages as well. It is difficult to determine the actual memory budget of an application, which is one of the main concerns of this chapter. The default allocation strategy is to over-commit, which leads to tricky out-of-memory situations, which I will also discuss later on. Finally, the delays introduced by the memory management code in handling exceptions (page faults) make the system less deterministic, which is important for real-time programs. I will cover this in Chapter 16, Real-Time Programming.
Memory management is different for kernel space and user space. The upcoming sections describe the essential differences and the things you need to know.
Kernel space memory layout
Kernel memory is managed in a fairly straightforward way. It is not demand-paged, which means that for every allocation using kmalloc() or a similar function, there is real physical memory. Kernel memory is never discarded or paged out.
Some architectures show a summary of the memory mapping at boot time in the kernel log messages. This trace is taken from a 32-bit ARM device (a BeagleBone Black):
    Memory: 511MB = 511MB total
    Memory: 505980k/505980k available, 18308k reserved, 0K highmem
    Virtual kernel memory layout:
        vector  : 0xffff0000 - 0xffff1000   (   4 kB)
        fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
        vmalloc : 0xe0800000 - 0xff000000   ( 488 MB)
        lowmem  : 0xc0000000 - 0xe0000000   ( 512 MB)
        pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
        modules : 0xbf800000 - 0xbfe00000   (   6 MB)
          .text : 0xc0008000 - 0xc0763c90   (7536 kB)
          .init : 0xc0764000 - 0xc079f700   ( 238 kB)
          .data : 0xc07a0000 - 0xc0827240   ( 541 kB)
           .bss : 0xc0827240 - 0xc089e940   ( 478 kB)
The figure of 505980 KiB available is the amount of free memory the kernel sees when it begins execution but before it begins making dynamic allocations.
Consumers of kernel space memory include the following:
- The kernel itself, in other words, the code and data loaded from the kernel image file at boot time. This is shown in the preceding kernel log in the segments .text, .init, .data, and .bss. The .init segment is freed once the kernel has completed initialization.
- Memory allocated through the slab allocator, which is used for kernel data structures of various kinds. This includes allocations made using kmalloc(). They come from the region marked lowmem.
- Memory allocated via vmalloc(), usually for larger chunks of memory than is available through kmalloc(). These are in the vmalloc area.
- Mapping for device drivers to access registers and memory belonging to various bits of hardware, which you can see by reading /proc/iomem. These also come from the vmalloc area, but since they are mapped to physical memory that is outside of main system memory, they do not take up any real memory.
- Kernel modules, which are loaded into the area marked modules.
- Other low-level allocations that are not tracked anywhere else.
How much memory does the kernel use?
Unfortunately, there isn't a complete answer to the question 'how much memory does the kernel use?', but what follows is as close as we can get.
Firstly, you can see the memory taken up by the kernel code and data in the kernel log shown previously, or you can use the size command, as follows:
    $ arm-poky-linux-gnueabi-size vmlinux
       text    data     bss      dec     hex filename
    9013448  796868 8428144 18238460 1164bfc vmlinux
Usually, the amount of memory taken by the kernel for the static code and data segments shown here is small when compared to the total amount of memory. If that is not the case, you need to look through the kernel configuration and remove the components that you don't need. There is an ongoing effort to allow small kernels to be built: search for Linux Kernel Tinification. There is a project page for it at https://tiny.wiki.kernel.org/.
You can get more information about memory usage by reading /proc/meminfo:
    # cat /proc/meminfo
    MemTotal:         509016 kB
    MemFree:          410680 kB
    Buffers:            1720 kB
    Cached:            25132 kB
    SwapCached:            0 kB
    Active:            74880 kB
    Inactive:           3224 kB
    Active(anon):      51344 kB
    Inactive(anon):     1372 kB
    Active(file):      23536 kB
    Inactive(file):     1852 kB
    Unevictable:           0 kB
    Mlocked:               0 kB
    HighTotal:             0 kB
    HighFree:              0 kB
    LowTotal:         509016 kB
    LowFree:          410680 kB
    SwapTotal:             0 kB
    SwapFree:              0 kB
    Dirty:                16 kB
    Writeback:             0 kB
    AnonPages:         51248 kB
    Mapped:            24376 kB
    Shmem:              1452 kB
    Slab:              11292 kB
    SReclaimable:       5164 kB
    SUnreclaim:         6128 kB
    KernelStack:        1832 kB
    PageTables:         1540 kB
    NFS_Unstable:          0 kB
    Bounce:                0 kB
    WritebackTmp:          0 kB
    CommitLimit:      254508 kB
    Committed_AS:     734936 kB
    VmallocTotal:     499712 kB
    VmallocUsed:       29576 kB
    VmallocChunk:     389116 kB
There is a description of each of these fields on the manual page proc(5). The kernel memory usage is the sum of the following:
- Slab: The total memory allocated by the slab allocator
- KernelStack: The stack space used when executing kernel code
- PageTables: The memory used to store page tables
- VmallocUsed: The memory allocated by vmalloc()
In the case of slab allocations, you can get more information by reading /proc/slabinfo. Similarly, there is a breakdown of allocations in /proc/vmallocinfo for the vmalloc area. In both cases, you need detailed knowledge of the kernel and its subsystems in order to see exactly which subsystem is making the allocations and why, which is beyond the scope of this discussion.
With modules, you can use lsmod to find out the memory space taken up by the code and data:
    # lsmod
    Module                  Size  Used by
    g_multi                47670  2
    libcomposite           14299  1 g_multi
    mt7601Usta            601404  0
This leaves the low-level allocations, of which there is no record and which prevent us from generating an accurate account of kernel space memory usage. They will appear as missing memory when we add up all the kernel and user space allocations that we know about.
User space memory layout
Linux employs a lazy allocation strategy for user space, only mapping physical pages of memory when the program accesses them. For example, allocating a buffer of 1 MiB using malloc(3) returns a pointer to a block of memory addresses but no actual physical memory. A flag is set in the page table entries such that any read or write access is trapped by the kernel. This is known as a page fault. Only at this point does the kernel attempt to find a page of physical memory and add it to the page table mapping for the process. It is worthwhile demonstrating this with a simple program, MELP/chapter_13/pagefault-demo:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/resource.h>

    #define BUFFER_SIZE (1024 * 1024)

    void print_pgfaults(void)
    {
        int ret;
        struct rusage usage;

        ret = getrusage(RUSAGE_SELF, &usage);
        if (ret == -1) {
            perror("getrusage");
        } else {
            printf("Major page faults %ld\n", usage.ru_majflt);
            printf("Minor page faults %ld\n", usage.ru_minflt);
        }
    }

    int main(int argc, char *argv[])
    {
        unsigned char *p;

        printf("Initial state\n");
        print_pgfaults();
        p = malloc(BUFFER_SIZE);
        printf("After malloc\n");
        print_pgfaults();
        memset(p, 0x42, BUFFER_SIZE);
        printf("After memset\n");
        print_pgfaults();
        memset(p, 0x42, BUFFER_SIZE);
        printf("After 2nd memset\n");
        print_pgfaults();
        return 0;
    }
When you run it, you will see something like this:
    Initial state
    Major page faults 0
    Minor page faults 172
    After malloc
    Major page faults 0
    Minor page faults 186
    After memset
    Major page faults 0
    Minor page faults 442
    After 2nd memset
    Major page faults 0
    Minor page faults 442
There were 172 minor page faults encountered after initializing the program's environment and a further 14 when calling getrusage(2) (these numbers will vary depending on the architecture and the version of the C library you are using). The important part is the increase when filling the memory with data: 442 - 186 = 256. The buffer is 1 MiB, which is 256 pages. The second call to memset(3) makes no difference because all the pages are now mapped.
As you can see, a page fault is generated when the kernel traps an access to a page that has not been mapped. In fact, there are two kinds of page faults: minor and major. With a minor fault, the kernel just has to find a page of physical memory and map it to the process address space, as shown in the preceding code. A major page fault occurs when the virtual memory is mapped to a file, for example, using mmap(2), which I will describe shortly. Reading from this memory means that the kernel not only has to find a page of memory and map it in, but it also has to fill it with data from the file. Consequently, major faults are much more expensive in time and system resources.
The process memory map
You can see the memory map for a process through the proc filesystem. As an example, here is the map for the init process, PID 1:
    # cat /proc/1/maps
    00008000-0000e000 r-xp 00000000 00:0b 23281745   /sbin/init
    00016000-00017000 rwxp 00006000 00:0b 23281745   /sbin/init
    00017000-00038000 rwxp 00000000 00:00 0          [heap]
    b6ded000-b6f1d000 r-xp 00000000 00:0b 23281695   /lib/libc-2.19.so
    b6f1d000-b6f24000 ---p 00130000 00:0b 23281695   /lib/libc-2.19.so
    b6f24000-b6f26000 r-xp 0012f000 00:0b 23281695   /lib/libc-2.19.so
    b6f26000-b6f27000 rwxp 00131000 00:0b 23281695   /lib/libc-2.19.so
    b6f27000-b6f2a000 rwxp 00000000 00:00 0
    b6f2a000-b6f49000 r-xp 00000000 00:0b 23281359   /lib/ld-2.19.so
    b6f4c000-b6f4e000 rwxp 00000000 00:00 0
    b6f4f000-b6f50000 r-xp 00000000 00:00 0          [sigpage]
    b6f50000-b6f51000 r-xp 0001e000 00:0b 23281359   /lib/ld-2.19.so
    b6f51000-b6f52000 rwxp 0001f000 00:0b 23281359   /lib/ld-2.19.so
    beea1000-beec2000 rw-p 00000000 00:00 0          [stack]
    ffff0000-ffff1000 r-xp 00000000 00:00 0          [vectors]
The first two columns show the start and end virtual addresses and the permissions for each mapping. The permissions are shown here:
- r: Read
- w: Write
- x: Execute
- s: Shared
- p: Private (copy on write)
If the mapping is associated with a file, the filename appears in the final column, and columns three, four, and five contain the offset from the start of the file, the block device number, and the inode of the file. Most of the mappings are to the program itself and the libraries it is linked with. There are two areas where the program can allocate memory, marked [heap] and [stack]. Memory allocated using malloc comes from the former (except for very large allocations, which we will come to later); allocations on the stack come from the latter. The maximum size of both areas is controlled by the process's ulimit:
- Heap: ulimit -d, default unlimited
- Stack: ulimit -s, default 8 MiB
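Allocations that exceed the limit are rejected: growing the stack beyond its limit results in a SIGSEGV, whereas a heap request that would exceed the data limit typically fails, with malloc returning NULL.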
When running out of memory, the kernel may decide to discard pages that are mapped to a file and are read-only. If that page is accessed again, it will cause a major page fault and be read back in from the file.
Swapping
The idea of swapping is to reserve some storage where the kernel can place pages of memory that are not mapped to a file so that it can free up the memory for other uses. It increases the effective size of physical memory by the size of the swap file. It is not a panacea: there is a cost to copying pages to and from a swap file, which becomes apparent on a system that has too little real memory for the workload it is carrying and so swapping becomes the main activity. This is sometimes known as disk thrashing.
Swap is seldom used on embedded devices because it does not work well with flash storage, where constant writing would wear it out quickly. However, you may want to consider swapping to compressed RAM (zram).
Swapping to compressed memory (zram)
The zram driver creates RAM-based block devices named /dev/zram0, /dev/zram1, and so on. Pages written to these devices are compressed before being stored. With compression ratios in the range of 30% to 50%, you can expect an overall increase in free memory of about 10% at the expense of more processing and a corresponding increase in power usage.
To enable zram, configure the kernel with these options:
    CONFIG_SWAP
    CONFIG_CGROUP_MEM_RES_CTLR
    CONFIG_CGROUP_MEM_RES_CTLR_SWAP
    CONFIG_ZRAM
Then, mount zram at boot time by adding this to /etc/fstab:
    /dev/zram0 none swap defaults zramsize=<size in bytes>,swapprio=<swap partition priority>
You can turn swap on and off using these commands:
    # swapon /dev/zram0
    # swapoff /dev/zram0
Mapping memory with mmap
A process begins life with a certain amount of memory mapped to the text (the code) and data segments of the program file, together with the shared libraries that it is linked with. It can allocate memory on its heap at runtime using malloc(3) and on the stack through locally scoped variables and memory allocated through alloca(3). It may also load libraries dynamically at runtime using dlopen(3). All of these mappings are taken care of by the kernel. However, a process can also manipulate its memory map in an explicit way using mmap(2):
    void *mmap(void *addr, size_t length, int prot, int flags,
               int fd, off_t offset);
This function maps length bytes of memory from the file with the descriptor fd, starting at offset in the file, and returns a pointer to the mapping, assuming it is successful. Since the underlying hardware works in pages, length is rounded up to the nearest whole number of pages. The protection parameter, prot, is a combination of read, write, and execute permissions and the flags parameter contains at least MAP_SHARED or MAP_PRIVATE. There are many other flags, which are described in the man page.
There are many things you can do with mmap. I will show some of them in the upcoming sections.
Using mmap to allocate private memory
You can use mmap to allocate an area of private memory by setting MAP_ANONYMOUS in the flags parameter and setting the file descriptor fd to -1. This is similar to allocating memory from the heap using malloc, except that the memory is page-aligned and in multiples of pages. The memory is allocated in the same area as that used for libraries. In fact, this area is referred to by some as the mmap area for this reason.
Anonymous mappings are better for large allocations because they do not pin down the heap with chunks of memory, which would make fragmentation more likely. Interestingly, you will find that malloc (in glibc at least) stops allocating memory from the heap for requests over 128 KiB and uses mmap in this way, so in most cases, just using malloc is the right thing to do. The system will choose the best way of satisfying the request.
Using mmap to share memory
As we saw in Chapter 12, Learning About Processes and Threads, POSIX shared memory requires mmap to access the memory segment. In this case, you set the MAP_SHARED flag and use the file descriptor from shm_open():
    int shm_fd;
    char *shm_p;

    shm_fd = shm_open("/myshm", O_CREAT | O_RDWR, 0666);
    ftruncate(shm_fd, 65536);
    shm_p = mmap(NULL, 65536, PROT_READ | PROT_WRITE,
                 MAP_SHARED, shm_fd, 0);
Using mmap to access device memory
As I mentioned in Chapter 9, Interfacing with Device Drivers, it is possible for a driver to allow its device node to be mmaped and share some of the device memory with an application. The exact implementation is dependent on the driver.
One example is the Linux framebuffer, /dev/fb0. The interface is defined in /usr/include/linux/fb.h, including an ioctl function to get the size of the display and the bits per pixel. You can then use mmap to ask the video driver to share the framebuffer with the application and read and write pixels:
    int f;
    int fb_size;
    unsigned char *fb_mem;

    f = open("/dev/fb0", O_RDWR);
    /* Use ioctl FBIOGET_VSCREENINFO to find the display dimensions
       and calculate fb_size */
    fb_mem = mmap(0, fb_size, PROT_READ | PROT_WRITE, MAP_SHARED, f, 0);
    /* read and write pixels through pointer fb_mem */
A second example is the streaming video interface, Video 4 Linux, version 2, or V4L2, which is defined in /usr/include/linux/videodev2.h. Each video device has a node named /dev/videoN, starting with /dev/video0. There is an ioctl function to ask the driver to allocate a number of video buffers that you can mmap into user space. Then, it is just a question of cycling the buffers and filling or emptying them with video data, depending on whether you are playing back or capturing a video stream.
How much memory does my application use?
As with kernel space, the different ways of allocating, mapping, and sharing user space memory make it quite difficult to answer this seemingly simple question.
To begin, you can ask the kernel how much memory it thinks is available, which you can do using the free command. Here is a typical example of the output:
                 total     used     free   shared  buffers   cached
    Mem:        509016   504312     4704        0    26456   363860
    -/+ buffers/cache:   113996   395020
    Swap:            0        0        0
At first sight, this looks like a system that is almost out of memory with only 4704 KiB free out of 509,016 KiB: less than 1%. However, note that 26,456 KiB is in buffers and a whopping 363,860 KiB is in caches. Linux believes that free memory is wasted memory and the kernel uses free memory for buffers and caches with the knowledge that they can be shrunk when the need arises. Removing buffers and cache from the measurement provides true free memory, which is 395,020 KiB: 77% of the total. When using free, the numbers on the second line marked -/+ buffers/cache are the important ones.
You can force the kernel to free up caches by writing a number between 1 and 3 to /proc/sys/vm/drop_caches:
    # echo 3 > /proc/sys/vm/drop_caches
The number is actually a bit mask that determines which of the two broad types of caches you want to free: 1 for the page cache and 2 for the dentry and inode caches combined, so writing 3 frees both. The exact roles of these caches are not particularly important here, only that there is memory that the kernel is using but that can be reclaimed at short notice.
Per-process memory usage
There are several metrics to measure the amount of memory a process is using. I will begin with the two that are easiest to obtain: the virtual set size (Vss) and the resident memory size (Rss), both of which are available in most implementations of the ps and top commands:
- Vss: Called VSZ in the ps command and VIRT in top, this is the total amount of memory mapped by a process. It is the sum of all the regions shown in /proc/<PID>/maps. This number is of limited interest since only part of the virtual memory is committed to physical memory at any time.
- Rss: Called RSS in ps and RES in top, this is the sum of memory that is mapped to physical pages of memory. This gets closer to the actual memory budget of the process, but there is a problem: if you add the Rss of all the processes, you will get an overestimate of the memory in use because some pages will be shared.
Using top and ps
The versions of top and ps from BusyBox provide very limited information. The examples that follow use the full version from the procps package.
The ps command shows Vss (VSZ) and Rss (RSS) with the options -Aly, or you can use a custom format that includes vsz and rss, as shown here:
    # ps -eo pid,tid,class,rtprio,stat,vsz,rss,comm
      PID   TID CLS RTPRIO STAT    VSZ   RSS COMMAND
        1     1 TS       - Ss     4496  2652 systemd
    [...]
      205   205 TS       - Ss     4076  1296 systemd-journal
      228   228 TS       - Ss     2524  1396 udevd
      581   581 TS       - Ss     2880  1508 avahi-daemon
      584   584 TS       - Ss     2848  1512 dbus-daemon
      590   590 TS       - Ss     1332   680 acpid
      594   594 TS       - Ss     4600  1564 wpa_supplicant
Likewise, top shows a summary of the free memory and memory usage per process:
    top - 21:17:52 up 10:04, 1 user, load average: 0.00, 0.01, 0.05
    Tasks:  96 total,   1 running,  95 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  1.7 us,  2.2 sy,  0.0 ni, 95.9 id,  0.0 wa,  0.0 hi
    KiB Mem:    509016 total,   278524 used,   230492 free,    25572 buffers
    KiB Swap:        0 total,        0 used,        0 free,   170920 cached
      PID USER  PR  NI  VIRT   RES  SHR S %CPU %MEM    TIME+ COMMAND
      595 root  20   0 64920  9.8m 4048 S  0.0  2.0  0:01.09 node
      866 root  20   0 28892  9152 3660 S  0.2  1.8  0:36.38 Xorg
    [...]
These simple commands give you a feel of the memory usage and provide the first indication that you have a memory leak when you see that the Rss of a process keeps on increasing. However, they are not very accurate in the absolute measurements of memory usage.
Using smem
In 2009, Matt Mackall began looking at the problem of accounting for shared pages in process memory measurement and added two new metrics called unique set size, or Uss, and proportional set size, or Pss:
- Uss: This is the amount of memory that is committed to physical memory and is unique to a process; it is not shared with any other. It is the amount of memory that would be freed if the process were to terminate.
- Pss: This splits the accounting of shared pages that are committed to physical memory between all the processes that have them mapped. For example, if an area of library code is 12 pages long and is shared by six processes, each will accumulate two pages in Pss. Thus, if you add the Pss numbers for all processes, you will get the actual amount of memory being used by those processes. In other words, Pss is the number we have been looking for.
Information about Pss is available in /proc/<PID>/smaps, which contains additional information for each of the mappings shown in /proc/<PID>/maps. Here is a section from such a file that provides information on the mapping for the libc code segment:
    b6e6d000-b6f45000 r-xp 00000000 b3:02 2444   /lib/libc-2.13.so
    Size:                864 kB
    Rss:                 264 kB
    Pss:                   6 kB
    Shared_Clean:        264 kB
    Shared_Dirty:          0 kB
    Private_Clean:         0 kB
    Private_Dirty:         0 kB
    Referenced:          264 kB
    Anonymous:             0 kB
    AnonHugePages:         0 kB
    Swap:                  0 kB
    KernelPageSize:        4 kB
    MMUPageSize:           4 kB
    Locked:                0 kB
    VmFlags: rd ex mr mw me
Note that the Rss is 264 KiB, but because it is shared between many other processes, the Pss is only 6 KiB.
There is a tool named smem that collates information from the smaps files and presents it in various ways, including as pie or bar charts. The project page for smem is https://www.selenic.com/smem. It is available as a package in most desktop distributions. However, since it is written in Python, installing it on an embedded target requires a Python environment, which may be too much trouble for just one tool. To help with this, there is a small program named smemcap that captures the state from /proc on the target and saves it to a TAR file that can be analyzed later on the host computer. It is part of BusyBox, but it can also be compiled from the smem source.
Running smem natively, as root, you will see these results:
    # smem -t
     PID User  Command                   Swap    USS    PSS    RSS
     610 0     /sbin/agetty -s ttyO0 11     0    128    149    720
    1236 0     /sbin/agetty -s ttyGS0 1     0    128    149    720
     609 0     /sbin/agetty tty1 38400      0    144    163    724
     578 0     /usr/sbin/acpid              0    140    173    680
     819 0     /usr/sbin/cron               0    188    201    704
     634 103   avahi-daemon: chroot hel     0    112    205    500
     980 0     /usr/sbin/udhcpd -S /etc     0    196    205    568
    ...
     836 0     /usr/bin/X :0 -auth /var     0   7172   7746   9212
     583 0     /usr/bin/node autorun.js     0   8772   9043  10076
    1089 1000  /usr/bin/python -O /usr/     0   9600  11264  16388
    ------------------------------------------------------------------
      53 6                                  0  65820  78251 146544
You can see from the last line of the output that in this case, the total Pss is about a half of the Rss.
If you don't have or don't want to install Python on your target, you can capture the state using smemcap, again as root:
    # smemcap > smem-bbb-cap.tar
Then, copy the TAR file to the host and read it using smem -S, though this time, there is no need to run as root:
    $ smem -t -S smem-bbb-cap.tar
The output is identical to the output we get when running smem natively.
Other tools to consider
Another way to display Pss is via ps_mem (https://github.com/pixelb/ps_mem), which prints much the same information but in a simpler format. It is also written in Python.
Android also has a tool that displays a summary of Uss and Pss for each process, named procrank, which can be cross-compiled for embedded Linux with a few small changes. You can get the code from https://github.com/csimmonds/procrank_linux.
Identifying memory leaks
A memory leak occurs when memory is allocated but not freed when it is no longer needed. Memory leakage is by no means unique to embedded systems, but it becomes an issue partly because targets don't have much memory in the first place and partly because they often run for long periods of time without rebooting, allowing the leaks to become a large puddle.
You will realize that there is a leak when you run free or top and see that free memory is continually going down even if you drop caches, as shown in the preceding section. You will be able to identify the culprit (or culprits) by looking at the Uss and Rss per process.
There are several tools to identify memory leaks in a program. I will look at two: mtrace and Valgrind.
mtrace
mtrace is a component of glibc that traces calls to malloc, free, and related functions, and identifies areas of memory not freed when the program exits. You need to call the mtrace() function from within the program to begin tracing and then, at runtime, set the MALLOC_TRACE environment variable to the path name of the file to which the trace information is written. If MALLOC_TRACE does not exist or if the file cannot be opened, the mtrace hooks are not installed. While the trace information is written in ASCII, it is usual to use the mtrace command to view it.
Here is an example:
    #include <mcheck.h>
    #include <stdlib.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int j;

        mtrace();
        for (j = 0; j < 2; j++)
            malloc(100);  /* Never freed: a memory leak */
        calloc(16, 16);   /* Never freed: a memory leak */
        exit(EXIT_SUCCESS);
    }
Here is what you might see when running the program and looking at the trace:
    $ export MALLOC_TRACE=mtrace.log
    $ ./mtrace-example
    $ mtrace mtrace-example mtrace.log
    Memory not freed:
    -----------------
               Address     Size     Caller
    0x0000000001479460     0x64  at /home/chris/mtrace-example.c:11
    0x00000000014794d0     0x64  at /home/chris/mtrace-example.c:11
    0x0000000001479540    0x100  at /home/chris/mtrace-example.c:15
Unfortunately, mtrace does not tell you about leaked memory while the program runs. It has to terminate first.
Valgrind
Valgrind is a very powerful tool used to discover memory problems including leaks and other things. One advantage is that you don't have to recompile the programs and libraries that you want to check, although it works better if they have been compiled with the -g option so that they include debug symbol tables. It works by running the program in an emulated environment and trapping execution at various points. This leads to the big downside of Valgrind, which is that the program runs at a fraction of normal speed, which makes it less useful in testing anything with real-time constraints.
Incidentally, the name is often mispronounced: it says in the Valgrind FAQ that the grind part is pronounced with a short i, as in grinned (rhymes with tinned) rather than grind (rhymes with find). The FAQ, documentation, and downloads are available at http://valgrind.org.
Valgrind contains several diagnostic tools:
- memcheck: This is the default tool, and it detects memory leaks and general misuse of memory
- cachegrind: This calculates the processor cache hit rate
- callgrind: This calculates the cost of each function call
- helgrind: This highlights the misuse of the Pthread API, including potential deadlocks, and race conditions
- DRD: This is another Pthread analysis tool
- massif: This profiles the usage of the heap and stack
You can select the tool you want with the --tool option. Valgrind runs on the major embedded platforms: ARM (Cortex-A), PPC, MIPS, and x86 in 32- and 64-bit variants. It is available as a package in both the Yocto Project and Buildroot.
To find our memory leak, we need to use the default memcheck tool, with the --leak-check=full option to print the lines where the leak was found:
    $ valgrind --leak-check=full ./mtrace-example
    ==17235== Memcheck, a memory error detector
    ==17235== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
    ==17235== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
    ==17235== Command: ./mtrace-example
    ==17235==
    ==17235==
    ==17235== HEAP SUMMARY:
    ==17235==     in use at exit: 456 bytes in 3 blocks
    ==17235==   total heap usage: 3 allocs, 0 frees, 456 bytes allocated
    ==17235==
    ==17235== 200 bytes in 2 blocks are definitely lost in loss record 1 of 2
    ==17235==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-linux.so)
    ==17235==    by 0x4005FA: main (mtrace-example.c:12)
    ==17235==
    ==17235== 256 bytes in 1 blocks are definitely lost in loss record 2 of 2
    ==17235==    at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-linux.so)
    ==17235==    by 0x400613: main (mtrace-example.c:14)
    ==17235==
    ==17235== LEAK SUMMARY:
    ==17235==    definitely lost: 456 bytes in 3 blocks
    ==17235==    indirectly lost: 0 bytes in 0 blocks
    ==17235==      possibly lost: 0 bytes in 0 blocks
    ==17235==    still reachable: 0 bytes in 0 blocks
    ==17235==         suppressed: 0 bytes in 0 blocks
    ==17235==
    ==17235== For counts of detected and suppressed errors, rerun with: -v
    ==17235== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Running out of memory
The standard memory allocation policy is to over-commit, which means that the kernel will allow more memory to be allocated by applications than there is physical memory. Most of the time, this works fine because it is common for applications to request more memory than they really need. This also helps in the implementation of fork(2): it is safe to make a copy of a large program because the pages of memory are shared with the copy on write flag set. In the majority of cases, fork is followed by an exec function call, which unshares the memory and then loads a new program.
However, there is always the possibility that a particular workload will cause a group of processes to try to cash in on the allocations they have been promised simultaneously and so demand more than there really is. This is an out of memory situation, or OOM. At this point, there is no other alternative but to kill off processes until the problem goes away. This is the job of the out of memory killer.
Before we get to that, there is a tuning parameter for kernel allocations in /proc/sys/vm/overcommit_memory, which you can set to the following:
- 0: Heuristic over-commit
- 1: Always over-commit; never check
- 2: Always check; never over-commit
Option 0 is the default and is the best choice in the majority of cases.
Option 1 is only really useful if you run programs that work with large sparse arrays and allocate large areas of memory but write to a small proportion of them. Such programs are rare in the context of embedded systems.
Option 2, never over-commit, seems to be a good choice if you are worried about running out of memory, perhaps in a mission or safety-critical application. It will fail allocations that are greater than the commit limit, which is the size of swap space plus the total memory multiplied by the over-commit ratio. The over-commit ratio is controlled by /proc/sys/vm/overcommit_ratio and has a default value of 50%.
As an example, suppose you have a device with 512 MB of system RAM and you set a really conservative ratio of 25%:
    # echo 25 > /proc/sys/vm/overcommit_ratio
    # grep -e MemTotal -e CommitLimit /proc/meminfo
    MemTotal:         509016 kB
    CommitLimit:      127252 kB
There is no swap, so the commit limit is 25% of MemTotal, as expected.
There is another important variable in /proc/meminfo: Committed_AS. This is the total amount of memory that is needed to fulfill all the allocations made so far. I found the following on one system:
    # grep -e MemTotal -e Committed_AS /proc/meminfo
    MemTotal:         509016 kB
    Committed_AS:     741364 kB
In other words, the kernel had already promised more memory than the available memory. Consequently, setting overcommit_memory to 2 would mean that all allocations would fail regardless of overcommit_ratio. To get to a working system, I would have to either install double the amount of RAM or severely reduce the number of running processes, of which there were about 40.
In all cases, the final defense is oom-killer. It uses a heuristic method to calculate a badness score between 0 and 1,000 for each process, and then terminates those with the highest score until there is enough free memory. You should see something like this in the kernel log:
    [44510.490320] eatmem invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
    ...
You can force an OOM event using echo f > /proc/sysrq-trigger.
You can influence the badness score for a process by writing an adjustment value to /proc/<PID>/oom_score_adj. A value of -1000 means that the badness score can never be greater than zero and so the process will never be killed; a value of +1000 means that the badness score is always at the top of the range and so the process will always be chosen to be killed.
Further reading
The following resources have further information on the topics introduced in this chapter:
- Linux Kernel Development, 3rd Edition, by Robert Love, Addison-Wesley Professional; (June, 2010) ISBN-10: 0672329468
- Linux System Programming, 2nd Edition, by Robert Love, O'Reilly Media; (8 June, 2013) ISBN-10: 1449339530
- Understanding the Linux VM Manager by Mel Gorman: https://www.kernel.org/doc/gorman/pdf/understand.pdf
- Valgrind 3.3 - Advanced Debugging and Profiling for Gnu/Linux Applications by J Seward, N. Nethercote, and J. Weidendorfer, Network Theory Ltd; (1 Mar, 2008) ISBN 978-0954612054
Summary
Accounting for every byte of memory used in a virtual memory system is just not possible. However, you can find a fairly accurate figure for the total amount of free memory, excluding that taken by buffers and cache, using the free command. By monitoring it over a period of time and with different workloads, you should become confident that it will remain within a given limit.
When you want to tune memory usage or identify sources of unexpected allocations, there are resources that give more detailed information. For kernel space, the most useful information is in /proc: meminfo, slabinfo, and vmallocinfo.
When it comes to getting accurate measurements for user space, the best metric is Pss, as shown by smem and other tools. For memory debugging, you can get help from simple tracers such as mtrace, or you have the heavyweight option of the Valgrind memcheck tool.
If you have concerns about the consequence of an out of memory situation, you can fine-tune the allocation mechanism via /proc/sys/vm/overcommit_memory and you can control the likelihood of particular processes being killed through the oom_score_adj parameter.
The next chapter is all about debugging user space and kernel code using the GNU debugger and the insights you can gain from watching code as it runs, including the memory management functions I have described here.
DebuggingwithGDBBugshappen.Identifyingandfixingthemispartofthedevelopmentprocess.Therearemanydifferenttechniquesforfindingandcharacterizingprogramdefects,includingstaticanddynamicanalysis,codereview,tracing,profiling,andinteractivedebugging.Iwilllookattracersandprofilersinthenextchapter,buthereIwanttoconcentrateonthetraditionalapproachofwatchingcodeexecutionthroughadebugger,whichinourcaseistheGNUProjectDebugger(GDB).GDBisapowerfulandflexibletool.Youcanuseittodebugapplications,examinethepostmortemfiles(corefiles)thatarecreatedafteraprogramcrash,andevenstepthroughkernelcode.
Inthischapter,wewillcoverthefollowingtopics:
TheGNUdebuggerPreparingtodebugDebuggingapplicationsJust-in-timedebuggingDebuggingforksandthreadsCorefilesGDBuserinterfacesDebuggingkernelcode
TheGNUdebuggerGDBisasource-leveldebuggerforcompiledlanguages,primarilyCandC++,althoughthereisalsosupportforavarietyofotherlanguagessuchasGoandObjective-C.YoushouldreadthenotesfortheversionofGDByouareusingtofindoutthecurrentstatusofsupportforthevariouslanguages.
Theprojectwebsiteishttp://www.gnu.org/software/gdbanditcontainsalotofusefulinformation,includingtheGDBUserManual,DebuggingwithGDB.
Outofthebox,GDBhasacommand-lineuserinterface,whichsomepeoplefindoff-putting,althoughinreality,itiseasytousewithalittlepractice.Ifcommand-lineinterfacesarenottoyourliking,thereareplentyoffront-enduserinterfacestoGDB,andIwilldescribethreeofthemlater.
Preparing to debug
You need to compile the code you want to debug with debug symbols. GCC offers two options for this: -g and -ggdb. The latter adds debug information that is specific to GDB, whereas the former generates information in an appropriate format for whichever target operating system you are using, making it the more portable option. In our particular case, the target operating system is always Linux, and it makes little difference whether you use -g or -ggdb. Of more interest is the fact that both options allow you to specify the level of debug information, from 0 to 3:
0: This produces no debug information at all and is equivalent to omitting the -g or -ggdb switch
1: This produces minimal information, but includes function names and external variables, which is enough to generate a backtrace
2: This is the default and includes information about local variables and line numbers so that you can perform source-level debugging and single-step through the code
3: This includes extra information which, among other things, means that GDB can handle macro expansions correctly
In most cases, -g suffices: reserve -g3 or -ggdb3 for when you are having problems stepping through code, especially if it contains macros.
The next issue to consider is the level of code optimization. Compiler optimization tends to destroy the relationship between lines of source code and machine code, which makes stepping through the source unpredictable. If you experience problems like this, you will most likely need to compile without optimization, leaving out the -O compile switch, or using -Og, which enables optimizations that do not interfere with debugging.
A related issue is that of stack-frame pointers, which are needed by GDB to generate a backtrace of function calls up to the current one. On some architectures, GCC will not generate stack-frame pointers with the higher levels of optimization (-O2 and above). If you find yourself in the situation that you really have to compile with -O2 but still want backtraces, you can override the default behavior with -fno-omit-frame-pointer. Also look out for code that has been hand-optimized to leave out frame pointers through the addition of -fomit-frame-pointer: you may want to temporarily remove those bits.
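As a rough sketch of how the options discussed in this section might be combined in a build script, the following shell function assembles the GCC flags for a chosen debug level. The function name debug_cflags and the CROSS_COMPILE variable in the usage comment are illustrative assumptions, not part of any toolchain.

```shell
#!/bin/sh
# Hypothetical helper: print GCC flags for a debug build.
# $1 is the debug level, 0 to 3 (defaults to 2, the same as plain -g).
debug_cflags() {
    level=${1:-2}
    case "$level" in
        0|1|2|3) ;;
        *) echo "debug level must be 0..3" >&2; return 1 ;;
    esac
    # -Og keeps optimization from interfering with stepping;
    # -fno-omit-frame-pointer preserves frame pointers for backtraces.
    printf '%s\n' "-g$level -Og -fno-omit-frame-pointer"
}
# Example use, assuming CROSS_COMPILE holds your toolchain prefix:
#   ${CROSS_COMPILE}gcc $(debug_cflags 3) helloworld.c -o helloworld
```

Level 3 is only worth selecting when macro expansion matters; for everyday use, calling the function without an argument gives the same result as plain -g.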
Debugging applications
You can use GDB to debug applications in one of two ways: if you are developing code to run on desktops and servers, or indeed any environment where you compile and run the code on the same machine, it is natural to run GDB natively. However, most embedded development is done using a cross toolchain, and hence you want to debug code running on the device but control it from the cross-development environment, where you have the source code and the tools. I will focus on the latter case since it is the most likely scenario for embedded developers, but I will also show you how to set up a system for native debugging. I am not going to describe the basics of using GDB here since there are many good references on that topic already, including the GDB user manual and the suggested Further reading at the end of the chapter.
Remote debugging using gdbserver
The key component for remote debugging is the debug agent, gdbserver, which runs on the target and controls execution of the program being debugged. gdbserver connects to a copy of GDB running on the host machine via a network connection or a serial interface.
Debugging through gdbserver is almost, but not quite, the same as debugging natively. The differences are mostly centered around the fact that there are two computers involved and they have to be in the right state for debugging to take place. Here are some things to look out for:
At the start of a debug session, you need to load the program you want to debug on the target using gdbserver, and then separately load GDB from your cross toolchain on the host.
GDB and gdbserver need to connect to each other before a debug session can begin.
GDB, running on the host, needs to be told where to look for debug symbols and source code, especially for shared libraries.
The GDB run command does not work as expected.
gdbserver will terminate when the debug session ends, and you will need to restart it if you want another debug session.
You need debug symbols and source code for the binaries you want to debug on the host, but not on the target. Often, there is not enough storage space for them on the target, and they will need to be stripped before deploying to the target.
The GDB/gdbserver combination does not support all the features of natively running GDB: for example, gdbserver cannot follow the child process after a fork, whereas native GDB can.
Odd things can happen if GDB and gdbserver are of different versions, or are the same version but configured differently. Ideally, they should be built from the same source using your favorite build tool.
Debug symbols increase the size of executables dramatically, sometimes by a factor of 10. As mentioned in Chapter 5, Building a Root Filesystem, it can be useful to remove debug symbols without recompiling everything. The tool for the job is strip from your cross toolchain. You can control the aggressiveness of strip with these switches:
--strip-all: This removes all symbols (default)
--strip-unneeded: This removes symbols not needed for relocation processing
--strip-debug: This removes only debug symbols
For applications and shared libraries, --strip-all (the default) is fine, but when it comes to kernel modules, you will find that it will stop the module from loading. Use --strip-unneeded instead. I am still working on a use case for --strip-debug.
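The rule of thumb above can be captured in a few lines of shell. This is only a sketch: the function name strip_flag_for is hypothetical, and it assumes that kernel modules can be recognized by their .ko suffix.

```shell
#!/bin/sh
# Hypothetical helper: choose a strip switch from the file name.
# Kernel modules (*.ko) must keep the symbols needed for relocation,
# so they get --strip-unneeded; everything else gets --strip-all.
strip_flag_for() {
    case "$1" in
        *.ko) printf '%s\n' "--strip-unneeded" ;;
        *)    printf '%s\n' "--strip-all" ;;
    esac
}
# Example use, assuming CROSS_COMPILE holds your toolchain prefix:
#   ${CROSS_COMPILE}strip $(strip_flag_for mbx.ko) mbx.ko
```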
With that in mind, let's look at the specifics involved in debugging with the Yocto Project and Buildroot.
Setting up the Yocto Project for remote debugging
There are two things to be done to debug applications remotely when using the Yocto Project: you need to add gdbserver to the target image, and you need to create an SDK that includes GDB and has debug symbols for the executables that you plan to debug.
First, then, to include gdbserver in the target image, you can add the package explicitly by adding this to conf/local.conf:
IMAGE_INSTALL_append = " gdbserver"
Alternatively, you can add tools-debug to EXTRA_IMAGE_FEATURES, which will add gdbserver, native gdb, and strace to the target image (I will talk about strace in the next chapter):
EXTRA_IMAGE_FEATURES = "debug-tweaks tools-debug"
For the second part, you just need to build an SDK as I described in Chapter 6, Selecting a Build System:
$ bitbake -c populate_sdk <image>
The SDK contains a copy of GDB. It also contains a sysroot for the target with debug symbols for all the programs and libraries that are part of the target image. Finally, the SDK contains the source code for the executables. For example, looking at an SDK built for the BeagleBone Black and generated by version 2.2.1 of the Yocto Project, it is installed by default into /opt/poky/2.2.1/. The sysroot for the target is /opt/poky/2.2.1/sysroots/cortexa8hf-neon-poky-linux-gnueabi/. The programs are in /bin/, /sbin/, /usr/bin/ and /usr/sbin/, relative to the sysroot, and the libraries are in /lib/ and /usr/lib/. In each of these directories, you will find a subdirectory named .debug/ that contains the symbols for each program and library. GDB knows to look in .debug/ when searching for symbol information. The source code for the executables is stored in /usr/src/debug/, relative to the sysroot.
Setting up Buildroot for remote debugging
Buildroot does not make a distinction between the build environment and that used for application development: there is no SDK. Assuming that you are using the Buildroot internal toolchain, you need to enable these options to build the cross GDB for the host and to build gdbserver for the target:
BR2_PACKAGE_HOST_GDB, in Toolchain | Build cross gdb for the host
BR2_PACKAGE_GDB, in Target packages | Debugging, profiling and benchmark | gdb
BR2_PACKAGE_GDB_SERVER, in Target packages | Debugging, profiling and benchmark | gdbserver
You also need to build executables with debug symbols, for which you need to enable BR2_ENABLE_DEBUG, in Build options | build packages with debugging symbols.
This will create libraries with debug symbols in output/host/usr/<arch>/sysroot.
Starting to debug
Now that you have gdbserver installed on the target and a cross GDB on the host, you can start a debug session.
Connecting GDB and gdbserver
The connection between GDB and gdbserver can be through a network or serial interface. In the case of a network connection, you launch gdbserver with the TCP port number to listen on and, optionally, an IP address to accept connections from. In most cases, you don't care which IP address is going to connect, so you can just provide the port number. In this example, gdbserver waits for a connection on port 10000 from any host:
# gdbserver :10000 ./hello-world
Process hello-world created; pid = 103
Listening on port 10000
Next, start the copy of GDB from your toolchain, pointing it at an unstripped copy of the program so that GDB can load the symbol table:
$ arm-poky-linux-gnueabi-gdb hello-world
In GDB, use the command target remote to make the connection to gdbserver, giving it the IP address or hostname of the target and the port it is waiting on:
(gdb) target remote 192.168.1.101:10000
When gdbserver sees the connection from the host, it prints the following:
Remote debugging from host 192.168.1.1
The procedure is similar for a serial connection. On the target, you tell gdbserver which serial port to use:
# gdbserver /dev/ttyO0 ./hello-world
You may need to configure the port baud rate beforehand using stty(1) or a similar program. A simple example would be as follows:
# stty -F /dev/ttyO0 115200
There are many other options to stty, so read the manual page for more details. It is worth noting that the port must not be in use for anything else. For example, you can't use a port that is being used as the system console.
On the host, you make the connection to gdbserver using target remote plus the serial device at the host end of the cable. In most cases, you will want to set the baud rate of the host serial port first, using the GDB command set serial baud:
(gdb) set serial baud 115200
(gdb) target remote /dev/ttyUSB0
Setting the sysroot
GDB needs to know where to find debug information and source code for the program and shared libraries you are debugging. When debugging natively, the paths are well known and built into GDB, but when using a cross toolchain, GDB has no way to guess where the root of the target filesystem is. You have to give it this information.
If you built your application using the Yocto Project SDK, the sysroot is within the SDK, and so you can set it in GDB like this:
(gdb) set sysroot /opt/poky/2.2.1/sysroots/cortexa8hf-neon-poky-linux-gnueabi
If you are using Buildroot, you will find that the sysroot is in output/host/usr/<toolchain>/sysroot, and that there is a symbolic link to it in output/staging. So, for Buildroot, you would set the sysroot like this:
(gdb) set sysroot /home/chris/buildroot/output/staging
GDB also needs to find the source code for the files you are debugging. GDB has a search path for source files, which you can see using the command show directories:
(gdb) show directories
Source directories searched: $cdir:$cwd
These are the defaults: $cwd is the current working directory of the GDB instance running on the host; $cdir is the directory where the source was compiled. The latter is encoded into the object files with the tag DW_AT_comp_dir. You can see these tags using objdump --dwarf, like this, for example:
$ arm-poky-linux-gnueabi-objdump --dwarf ./helloworld | grep DW_AT_comp_dir
[...]
<160> DW_AT_comp_dir : (indirect string, offset: 0x244): /home/chris/helloworld
[...]
In most cases, the defaults, $cdir:$cwd, are sufficient, but problems arise if the directories have been moved between compilation and debugging. One such case occurs with the Yocto Project. Taking a deeper look at the DW_AT_comp_dir tags for a program compiled using the Yocto Project SDK, you may notice this:
$ arm-poky-linux-gnueabi-objdump --dwarf ./helloworld | grep DW_AT_comp_dir
<2f> DW_AT_comp_dir : /usr/src/debug/glibc/2.24-r0/git/csu
<79> DW_AT_comp_dir : (indirect string, offset: 0x139): /usr/src/debug/glibc/2.24-r0/git/csu
<116> DW_AT_comp_dir : /usr/src/debug/glibc/2.24-r0/git/csu
<160> DW_AT_comp_dir : (indirect string, offset: 0x244): /home/chris/helloworld
[...]
Here, you can see multiple references to the directory /usr/src/debug/glibc/2.24-r0/git, but where is it? The answer is that it is in the sysroot for the SDK, so the full path is /opt/poky/2.2.1/sysroots/cortexa8hf-neon-poky-linux-gnueabi/usr/src/debug/glibc/2.24-r0/git. The SDK contains source code for all of the programs and libraries that are in the target image. GDB has a simple way to cope with an entire directory tree being moved like this: substitute-path. So, when debugging with the Yocto Project SDK, you need to use these commands:
(gdb) set sysroot /opt/poky/2.2.1/sysroots/cortexa8hf-neon-poky-linux-gnueabi
(gdb) set substitute-path /usr/src/debug /opt/poky/2.2.1/sysroots/cortexa8hf-neon-poky-linux-gnueabi/usr/src/debug
You may have additional shared libraries that are stored outside the sysroot. In that case, you can use set solib-search-path, which can contain a colon-separated list of directories to search for shared libraries. GDB searches solib-search-path only if it cannot find the binary in the sysroot.
A third way of telling GDB where to look for source code, for both libraries and programs, is using the directory command:
(gdb) directory /home/chris/MELP/src/lib_mylib
Source directories searched: /home/chris/MELP/src/lib_mylib:$cdir:$cwd
Paths added in this way take precedence because they are searched before those from sysroot or solib-search-path.
GDB command files
There are some things that you need to do each time you run GDB, for example, setting the sysroot. It is convenient to put such commands into a command file and run them each time GDB is started. GDB reads commands from $HOME/.gdbinit, then from .gdbinit in the current directory, and then from files specified on the command line with the -x parameter. However, recent versions of GDB will refuse to load .gdbinit from the current directory for security reasons. You can override that behavior by adding a line like this to your $HOME/.gdbinit:
set auto-load safe-path /
Alternatively, if you don't want to enable auto-loading globally, you can specify a particular directory like this:
add-auto-load-safe-path /home/chris/myprog
My personal preference is to use the -x parameter to point to the command file, which exposes the location of the file so that I don't forget about it.
To help you set up GDB, Buildroot creates a GDB command file containing the correct sysroot command in output/staging/usr/share/buildroot/gdbinit. It will contain a line similar to this one:
set sysroot /home/chris/buildroot/output/host/usr/arm-buildroot-linux-gnueabi/sysroot
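Buildroot generates its command file for you; with the Yocto Project SDK you have to write one yourself. A minimal sketch of a generator follows. The function name yocto_gdbinit is hypothetical, and it assumes that the debug sources are recorded under /usr/src/debug as described earlier.

```shell
#!/bin/sh
# Hypothetical helper: print a GDB command file for a Yocto Project SDK.
# $1 is the SDK sysroot, $2 is the gdbserver address as host:port.
yocto_gdbinit() {
    sysroot=$1
    remote=$2
    printf 'set sysroot %s\n' "$sysroot"
    # Source paths recorded as /usr/src/debug/... live inside the sysroot.
    printf 'set substitute-path /usr/src/debug %s/usr/src/debug\n' "$sysroot"
    printf 'target remote %s\n' "$remote"
}
# Example use:
#   yocto_gdbinit /opt/poky/2.2.1/sysroots/cortexa8hf-neon-poky-linux-gnueabi \
#       192.168.1.101:10000 > gdbinit
#   arm-poky-linux-gnueabi-gdb -x gdbinit helloworld
```

The sysroot is set before the connection is made so that GDB can resolve shared library symbols locally as soon as it attaches.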
Overview of GDB commands
GDB has a great many commands, which are described in the online manual and in the resources mentioned in the Further reading section. To help you get going as quickly as possible, here is a list of the most commonly used commands. In most cases, there is a short form for the command, which is listed in the tables following.
Breakpoints
These are the commands for managing breakpoints:
Command                Short-form command  Use
break <location>       b <location>        Set a breakpoint on a function name, a line number, or a file name and line number. Examples of locations are main, 5, and sortbug.c:42.
info breakpoints       i b                 List breakpoints.
delete breakpoint <N>  d b <N>             Delete breakpoint <N>.
Running and stepping
These are commands for controlling the execution of a program:
Command   Short-form command  Use
run       r                   Load a fresh copy of the program into memory and start running it. This does not work for remote debug using gdbserver.
continue  c                   Continue execution from a breakpoint.
Ctrl-C    -                   Stop the program being debugged.
step      s                   Step one line of code, stepping into any function that is called.
next      n                   Step one line of code, stepping over a function call.
finish    -                   Run until the current function returns.
Getting information
These are commands for getting information about the program being debugged:
Command             Short-form command  Use
backtrace           bt                  List the call stack
info threads        i th                Display information about the threads executing in the program
info sharedlibrary  i share             Display information about shared libraries currently loaded by the program
print <variable>    p <variable>        Print the value of a variable, for example print foo
list                l                   List lines of code around the current program counter
Running to a breakpoint
gdbserver loads the program into memory and sets a breakpoint at the first instruction, then waits for a connection from GDB. When the connection is made, you enter into a debug session. However, you will find that if you try to single-step immediately, you will get this message:
Cannot find bounds of current function
This is because the program has been halted in code written in assembly which creates the runtime environment for C and C++ programs. The first line of C or C++ code is the main() function. Supposing that you want to stop at main(), you would set a breakpoint there and then use the continue command (abbreviation c) to tell gdbserver to continue from the breakpoint at the start of the program and stop at main():
(gdb) break main
(gdb) c
Breakpoint 1, main (argc=1, argv=0xbefffe24) at helloworld.c:8
printf("Hello, world!\n");
At this point, you may see the following:
Reading /lib/ld-linux.so.3 from remote target...
warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
With older versions of GDB, you may instead see this:
warning: Could not load shared library symbols for 2 libraries, e.g. /lib/libc.so.6.
In both cases, the problem is that you have forgotten to set the sysroot! Take another look at the earlier section on sysroot.
This is all very different to starting a program natively, where you just type run. In fact, if you try typing run in a remote debug session, you will either see a message saying that the remote target does not support the run command, or in older versions of GDB, it will just hang without any explanation.
Native debugging
Running a native copy of GDB on the target is not as common as doing it remotely, but it is possible. As well as installing GDB in the target image, you will also need unstripped copies of the executables you want to debug and the corresponding source code installed in the target image. Both the Yocto Project and Buildroot allow you to do this.
While native debugging is not a common activity for embedded developers, running profile and trace tools on the target is very common. These tools usually work best if you have unstripped binaries and source code on the target, which is half of the story I am telling here. I will return to the topic in the next chapter.
The Yocto Project
To begin with, you need to add gdb to the target image by adding this to conf/local.conf:
IMAGE_INSTALL_append = " gdb"
Next, you need the debug information for the packages you want to debug. The Yocto Project builds debug variants of packages, which contain unstripped binaries and the source code. You can add these debug packages selectively to your target image by adding <packagename>-dbg to your conf/local.conf. Alternatively, you can simply install all debug packages by adding dbg-pkgs to EXTRA_IMAGE_FEATURES. Be warned that this will increase the size of the target image dramatically, perhaps by several hundreds of megabytes.
EXTRA_IMAGE_FEATURES = "dbg-pkgs"
The source code is installed into /usr/src/debug/<packagename> in the target image. This means that GDB will pick it up without needing to run set substitute-path. If you don't need the source, you can prevent it from being installed by adding this to your conf/local.conf file:
PACKAGE_DEBUG_SPLIT_STYLE = "debug-without-src"
Buildroot
With Buildroot, you can tell it to install a native copy of GDB in the target image by enabling this option:
BR2_PACKAGE_GDB_DEBUGGER in Target packages | Debugging, profiling and benchmark | Full debugger
Then, to build binaries with debug information and to install them in the target image without stripping, enable these two options:
BR2_ENABLE_DEBUG in Build options | Build packages with debugging symbols
BR2_STRIP_none in Build options | Strip command for binaries on target
Just-in-time debugging
Sometimes a program will start to misbehave after it has been running for a while, and you would like to know what it is doing. The GDB attach feature does exactly this. I call it just-in-time debugging. It is available with both native and remote debug sessions.
In the case of remote debugging, you need to find the PID of the process to be debugged and pass it to gdbserver with the --attach option. For example, if the PID is 109, you would type this:
# gdbserver --attach :10000 109
Attached; pid = 109
Listening on port 10000
This forces the process to stop as if it were at a breakpoint, allowing you to start your cross GDB in the normal way and connect to gdbserver. When you are done, you can detach, allowing the program to continue running without the debugger:
(gdb) detach
Detaching from program: /home/chris/MELP/helloworld/helloworld, process 109
Ending remote debugging.
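Looking up the PID and typing the attach command by hand soon becomes tedious. The following sketch builds the command line from a port and a PID; the function name attach_cmd is hypothetical, and the usage comment assumes that pidof is available on the target.

```shell
#!/bin/sh
# Hypothetical helper: print the gdbserver command that attaches to a PID.
# $1 is the TCP port to listen on, $2 is the PID of the running process.
attach_cmd() {
    port=$1
    pid=$2
    # Reject an empty PID or anything that is not purely numeric.
    case "$pid" in
        ''|*[!0-9]*) echo "PID must be a number" >&2; return 1 ;;
    esac
    printf 'gdbserver --attach :%s %s\n' "$port" "$pid"
}
# Example use on the target, assuming pidof is available (for example
# from BusyBox) and exactly one helloworld process is running:
#   $(attach_cmd 10000 "$(pidof helloworld)")
```

The numeric check matters because pidof prints nothing when the program is not running and several PIDs when there is more than one instance; in both cases it is safer to fail than to attach to the wrong process.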
Debugging forks and threads
What happens when the program you are debugging forks? Does the debug session follow the parent process or the child? This behavior is controlled by follow-fork-mode, which may be parent or child, with parent being the default. Unfortunately, current versions of gdbserver do not support this option, so it only works for native debugging. If you really need to debug the child process while using gdbserver, a workaround is to modify the code so that the child loops on a variable immediately after the fork, giving you the opportunity to attach a new gdbserver session to it and then to set the variable so that it drops out of the loop.
When a thread in a multi-threaded process hits a breakpoint, the default behavior is for all threads to halt. In most cases, this is the best thing to do as it allows you to look at static variables without them being changed by the other threads. When you recommence execution of the thread, all the stopped threads start up, even if you are single-stepping, and it is especially this last case that can cause problems. There is a way to modify the way in which GDB handles stopped threads, through a parameter called scheduler-locking. Normally it is off, but if you set it to on, only the thread that was stopped at the breakpoint is resumed and the others remain stopped, giving you a chance to see what the thread alone does without interference. This continues to be the case until you turn scheduler-locking off. gdbserver supports this feature.
Core files
Core files capture the state of a failing program at the point that it terminates. You don't even have to be in the room with a debugger when the bug manifests itself. So, when you see Segmentation fault (core dumped), don't shrug; investigate the core file and extract the goldmine of information in there.
The first observation is that core files are not created by default, but only when the core file resource limit for the process is non-zero. You can change it for the current shell using ulimit -c. To remove all limits on the size of core files, type the following command:
$ ulimit -c unlimited
By default, the core file is named core and is placed in the current working directory of the process, which is the one pointed to by /proc/<PID>/cwd. There are a number of problems with this scheme. Firstly, when looking at a device with several files named core, it is not obvious which program generated each one. Secondly, the current working directory of the process may well be in a read-only filesystem, there may not be enough space to store the core file, or the process may not have permissions to write to the current working directory.
There are two files that control the naming and placement of core files. The first is /proc/sys/kernel/core_uses_pid. Writing a 1 to it causes the PID number of the dying process to be appended to the filename, which is somewhat useful as long as you can associate the PID number with a program name from log files.
Much more useful is /proc/sys/kernel/core_pattern, which gives you a lot more control over core files. The default pattern is core, but you can change it to a pattern composed of these metacharacters:
%p: The PID
%u: The real UID of the dumped process
%g: The real GID of the dumped process
%s: The number of the signal causing the dump
%t: The time of dump, expressed as seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC)
%h: The hostname
%e: The executable filename
%E: The pathname of the executable, with slashes (/) replaced by exclamation marks (!)
%c: The core file size soft resource limit of the dumped process
You can also use a pattern that begins with an absolute directory name so that all core files are gathered together in one place. As an example, the following pattern puts all core files into the directory /corefiles and names them with the program name and the time of the crash:
# echo /corefiles/core.%e.%t > /proc/sys/kernel/core_pattern
Following a core dump, you would find something like this:
# ls /corefiles
core.sort-debug.1431425613
For more information, refer to the manual page core(5).
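Before committing a pattern to /proc/sys/kernel/core_pattern, it can help to preview the file name it will produce. The sketch below mimics the kernel's substitution for three of the metacharacters only; the function name expand_core_pattern is hypothetical, and the real expansion is done by the kernel, not by this script.

```shell
#!/bin/sh
# Hypothetical helper: preview the name produced by a core_pattern.
# Only %e, %p and %t are handled here; the kernel supports more.
# $1 pattern, $2 executable name, $3 PID, $4 time in seconds since the Epoch.
# Assumes the executable name contains no '/', '&' or '\' characters,
# since it is pasted directly into the sed replacement text.
expand_core_pattern() {
    printf '%s\n' "$1" | sed -e "s/%e/$2/g" -e "s/%p/$3/g" -e "s/%t/$4/g"
}
# Example use:
#   expand_core_pattern '/corefiles/core.%e.%t' sort-debug 109 "$(date +%s)"
```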
Using GDB to look at core files
Here is a sample GDB session looking at a core file:
$ arm-poky-linux-gnueabi-gdb sort-debug /home/chris/rootfs/corefiles/core.sort-debug.1431425613
[...]
Core was generated by `./sort-debug'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000085c8 in addtree (p=0x0, w=0xbeac4c60 "the") at sort-debug.c:41
41          p->word = strdup(w);
That shows that the program stopped at line 41. The list command shows the code in the vicinity:
(gdb) list
37      static struct tnode *addtree(struct tnode *p, char *w)
38      {
39          int cond;
40
41          p->word = strdup(w);
42          p->count = 1;
43          p->left = NULL;
44          p->right = NULL;
45
The backtrace command (shortened to bt) shows how we got to this point:
(gdb) bt
#0 0x000085c8 in addtree (p=0x0, w=0xbeac4c60 "the") at sort-debug.c:41
#1 0x00008798 in main (argc=1, argv=0xbeac4e24) at sort-debug.c:89
An obvious mistake: addtree() was called with a null pointer.
GDB user interfaces
GDB is controlled at a low level through the GDB machine interface, GDB/MI, which can be used to wrap GDB in a user interface or as part of a larger program, and it considerably extends the range of options available to you.
In this section, I will describe three that are well suited to debugging embedded targets: the Terminal user interface, TUI; the data display debugger, DDD; and the Eclipse C/C++ Development Tooling (CDT).
Terminal user interface
Terminal user interface (TUI) is an optional part of the standard GDB package. The main feature is a code window that shows the line of code about to be executed, together with any breakpoints. It is a definite improvement on the list command in command-line mode GDB.
The attraction of TUI is that it just works without any extra setup, and since it is in text mode it is possible to use over an SSH terminal session, for example, when running gdb natively on a target. Most cross toolchains configure GDB with TUI. Simply add -tui to the command line and you will see the following:
Data display debugger
Data display debugger (DDD) is a simple standalone program that gives you a graphical user interface to GDB with minimal fuss and bother, and although the UI controls look dated, it does everything that is necessary.
The --debugger option tells DDD to use GDB from your toolchain, and you can use the -x argument to give the path to a GDB command file:
$ ddd --debugger arm-poky-linux-gnueabi-gdb -x gdbinit sort-debug
The following screenshot shows off one of the nicest features: the data window, which contains items in a grid that you can rearrange as you wish. If you double-click on a pointer, it is expanded into a new data item and the link is shown with an arrow:
Eclipse
Eclipse, with the CDT plugin, supports debugging with GDB, including remote debugging. If you use Eclipse for all your code development, this is the obvious tool to use, but if you are not a regular Eclipse user, it is probably not worth the effort of setting it up just for this task. It would take me a whole chapter to explain adequately how to configure CDT to work with a cross toolchain and connect to a remote device, so I will refer you to the references at the end of the chapter for more information. The screenshot that follows shows the debug perspective of CDT. In the top-left window, you see the stack frames for each of the threads in the process, and at the top right is the watch window, showing variables. In the middle is the code window, showing the line of code where the debugger has stopped the program:
Debugging kernel code
You can use kgdb for source-level debugging, in a manner similar to remote debugging with gdbserver. There is also a self-hosted kernel debugger, kdb, that is handy for lighter-weight tasks such as seeing whether an instruction is executed and getting the backtrace to find out how it got there. Finally, there are kernel Oops messages and panics, which tell you a lot about the cause of a kernel exception.
Debugging kernel code with kgdb
When looking at kernel code using a source debugger, you must remember that the kernel is a complex system, with real-time behaviors. Don't expect debugging to be as easy as it is for applications. Stepping through code that changes the memory mapping or switches context is likely to produce odd results.
kgdb is the name given to the kernel GDB stubs that have been part of mainline Linux for many years now. There is a user manual in the kernel DocBook, and you can find an online version at https://www.kernel.org/doc/htmldocs/kgdb/index.html.
In most cases, you will connect to kgdb over the serial interface, which is usually shared with the serial console. Hence, this implementation is called kgdboc, which is short for kgdb over console. To work, it requires a platform tty driver that supports I/O polling instead of interrupts, since kgdb has to disable interrupts when communicating with GDB. A few platforms support kgdb over USB, and there have been versions that work over Ethernet but, unfortunately, none of those have found their way into mainline Linux.
The same caveats about optimization and stack frames apply to the kernel, with the limitation that the kernel is written to assume an optimization level of at least -O1. You can override the kernel compile flags by setting KCFLAGS before running make.
These, then, are the kernel configuration options you will need for kernel debugging:
CONFIG_DEBUG_INFO is in the menu Kernel hacking | Compile-time checks and compiler options | Compile the kernel with debug info
CONFIG_FRAME_POINTER may be an option for your architecture, and is in the menu Kernel hacking | Compile-time checks and compiler options | Compile the kernel with frame pointers
CONFIG_KGDB is in the menu Kernel hacking | KGDB: kernel debugger
CONFIG_KGDB_SERIAL_CONSOLE is in the menu Kernel hacking | KGDB: kernel debugger | KGDB: use kgdb over the serial console
In addition to the zImage or uImage compressed kernel image, you will need the kernel image in ELF object format so that GDB can load the symbols into memory. This is the file called vmlinux that is generated in the directory where Linux is built. In Yocto, you can request that a copy be included in the target image and SDK. It is built as a package named kernel-vmlinux, which you can install like any other, for example, by adding it to the IMAGE_INSTALL list.
The file is put into the sysroot boot directory, with a name such as this:
/opt/poky/2.2.1/sysroots/cortexa8hf-neon-poky-linux-gnueabi/boot/vmlinux-4.8.12-yocto-standard
In Buildroot, you will find vmlinux in the directory where the kernel was built, which is in output/build/linux-<version string>/vmlinux.
A sample debug session
The best way to show you how it works is with a simple example.
You need to tell kgdb which serial port to use, either through the kernel command line or at runtime via sysfs. For the first option, add kgdboc=<tty>,<baud rate> to the command line, as shown here:
kgdboc=ttyO0,115200
For the second option, boot the device up and write the terminal name to the file /sys/module/kgdboc/parameters/kgdboc, as shown here:
# echo ttyO0 > /sys/module/kgdboc/parameters/kgdboc
Note that you cannot set the baud rate in this way. If it is the same tty as the console, then it is set already. If not, use stty or a similar program.
Now you can start GDB on the host, selecting the vmlinux file that matches the running kernel:
$ arm-poky-linux-gnueabi-gdb ~/linux/vmlinux
GDB loads the symbol table from vmlinux and waits for further input.
Next, close any terminal emulator that is attached to the console: you are about to use it for GDB, and if both are active at the same time, some of the debug strings might get corrupted.
Now, you can return to GDB and attempt to connect to kgdb. However, you will find that the response you get from target remote at this time is unhelpful:
(gdb) set serial baud 115200
(gdb) target remote /dev/ttyUSB0
Remote debugging using /dev/ttyUSB0
Bogus trace status reply from target: qTStatus
The problem is that kgdb is not listening for a connection at this point. You need to interrupt the kernel before you can enter into an interactive GDB session with it. Unfortunately, just typing Ctrl+C in GDB, as you would with an application, does not work. You have to force a trap into the kernel by launching another shell on the target, via SSH for example, and writing a g to /proc/sysrq-trigger on the target board:
# echo g > /proc/sysrq-trigger
The target stops dead at this point. Now you can connect to kgdb via the serial device at the host end of the cable:
(gdb) set serial baud 115200
(gdb) target remote /dev/ttyUSB0
Remote debugging using /dev/ttyUSB0
0xc009a59c in arch_kgdb_breakpoint ()
At last, GDB is in charge. You can set breakpoints, examine variables, look at backtraces, and so on. As an example, set a break on sys_sync, as follows:
(gdb) break sys_sync
Breakpoint 1 at 0xc0128a88: file fs/sync.c, line 103.
(gdb) c
Continuing.
Now the target comes back to life. Typing sync on the target calls sys_sync and hits the breakpoint:
[New Thread 87]
[Switching to Thread 87]
Breakpoint 1, sys_sync () at fs/sync.c:103
If you have finished the debug session and want to disable kgdboc, just set the kgdboc terminal to null:
# echo "" > /sys/module/kgdboc/parameters/kgdboc
Debugging early code
The preceding example works in cases where the code you are interested in is executed when the system is fully booted. If you need to get in early, you can tell the kernel to wait during boot by adding kgdbwait to the command line, after the kgdboc option:
kgdboc=ttyO0,115200 kgdbwait
Now, when you boot, you will see this on the console:
[ 1.103415] console [ttyO0] enabled
[ 1.108216] kgdb: Registered I/O driver kgdboc.
[ 1.113071] kgdb: Waiting for connection from remote gdb...
At this point, you can close the console and connect from GDB in the usual way.
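If you switch between normal and early debugging often, a small helper keeps the two options in the right order. This is only a sketch: the function name kgdb_bootargs is hypothetical, and how you append its output to the kernel command line depends on your bootloader.

```shell
#!/bin/sh
# Hypothetical helper: print the kgdb part of a kernel command line.
# $1 is the tty name, $2 the baud rate, $3 is "wait" to add kgdbwait.
kgdb_bootargs() {
    args="kgdboc=$1,$2"
    # kgdbwait must come after kgdboc on the command line.
    if [ "${3:-}" = "wait" ]; then
        args="$args kgdbwait"
    fi
    printf '%s\n' "$args"
}
# Example use:
#   kgdb_bootargs ttyO0 115200 wait
```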
Debugging modules
Debugging kernel modules presents an additional challenge because the code is relocated at runtime, and so you need to find out at what address it resides. The information is presented through sysfs. The relocation addresses for each section of the module are stored in /sys/module/<module name>/sections. Note that since ELF sections begin with a dot (.), they appear as hidden files, and you will have to use ls -a if you want to list them. The important ones are .text, .data, and .bss.
Take as an example a module named mbx:
# cat /sys/module/mbx/sections/.text
0xbf000000
# cat /sys/module/mbx/sections/.data
0xbf0003e8
# cat /sys/module/mbx/sections/.bss
0xbf0005c0
Now you can use these numbers in GDB to load the symbol table for the module at those addresses:
(gdb) add-symbol-file /home/chris/mbx-driver/mbx.ko 0xbf000000 \
-s .data 0xbf0003e8 -s .bss 0xbf0005c0
add symbol table from file "/home/chris/mbx-driver/mbx.ko" at
.text_addr = 0xbf000000
.data_addr = 0xbf0003e8
.bss_addr = 0xbf0005c0
Everything should now work as normal: you can set breakpoints and inspect global and local variables in the module just as you can in vmlinux:
(gdb) break mbx_write
Breakpoint 1 at 0xbf00009c: file /home/chris/mbx-driver/mbx.c, line 93.
(gdb) c
Continuing.
Then, force the device driver to call mbx_write, and it will hit the breakpoint:
Breakpoint 1, mbx_write (file=0xde7a71c0, buffer=0xadf40 "hello\n\n",
length=6, offset=0xde73df80)
at /home/chris/mbx-driver/mbx.c:93
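Copying three addresses by hand is error-prone, so the add-symbol-file command can be assembled by a script run on the target. The following is a minimal sketch: the function name module_symbol_cmd is hypothetical, and it assumes that the sections directory contains readable .text, .data, and .bss files as shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the GDB add-symbol-file command for a module.
# $1 is the path to the .ko file as seen from the host,
# $2 is the sections directory, for example /sys/module/mbx/sections.
module_symbol_cmd() {
    ko=$1
    dir=$2
    # Each file holds the load address of one ELF section.
    text=$(cat "$dir/.text") || return 1
    data=$(cat "$dir/.data") || return 1
    bss=$(cat "$dir/.bss") || return 1
    printf 'add-symbol-file %s %s -s .data %s -s .bss %s\n' \
        "$ko" "$text" "$data" "$bss"
}
# Example use on the target:
#   module_symbol_cmd /home/chris/mbx-driver/mbx.ko /sys/module/mbx/sections
```

The output is a single line that can be pasted at the (gdb) prompt on the host, or appended to a command file.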
Debugging kernel code with kdb
Although kdb does not have the features of kgdb and GDB, it does have its uses, and being self-hosted, there are no external dependencies to worry about. kdb has a simple command-line interface that you can use on a serial console. You can use it to inspect memory, registers, process lists, and dmesg, and even set breakpoints to stop at a certain location.
To configure your kernel so that you can call kdb via a serial console, enable kgdb as shown previously, and then enable this additional option:
CONFIG_KGDB_KDB, which is in the menu Kernel hacking | KGDB: kernel debugger | KGDB_KDB: include kdb frontend for kgdb
Now, when you force the kernel into a trap, instead of entering into a GDB session, you will see the kdb shell on the console:
# echo g > /proc/sysrq-trigger
[ 42.971126] SysRq : DEBUG
Entering kdb (current=0xdf36c080, pid 83) due to Keyboard Entry
kdb>
There are quite a few things you can do in the kdb shell. The help command will print all of the options. Here is an overview:
Getting information:
ps: This displays active processes
ps A: This displays all processes
lsmod: This lists modules
dmesg: This displays the kernel log buffer
Breakpoints:
bp: This sets a breakpoint
bl: This lists breakpoints
bc: This clears a breakpoint
bt: This prints a backtrace
go: This continues execution
Inspect memory and registers:
md: This displays memory
rd: This displays registers
Here is a quick example of setting a breakpoint:
kdb> bp sys_sync
Instruction(i) BP #0 at 0xc01304ec (sys_sync)
is enabled addr at 00000000c01304ec, hardtype=0 installed=0
kdb> go
The kernel returns to life and the console shows the normal shell prompt. If you type sync, it hits the breakpoint and enters kdb again:
Entering kdb (current=0xdf388a80, pid 88) due to Breakpoint @0xc01304ec
kdb is not a source-level debugger, so you can't see the source code or single-step. However, you can display a backtrace using the bt command, which is useful to get an idea of program flow and call hierarchy.
Looking at an Oops
When the kernel performs an invalid memory access or executes an illegal instruction, a kernel Oops message is written to the kernel log. The most useful part of this is the backtrace, and I want to show you how to use the information there to locate the line of code that caused the fault. I will also address the problem of preserving Oops messages if they cause the system to crash.
This Oops message was generated by writing to the mailbox driver in MELP/chapter_14/mbx-driver-oops:
Unable to handle kernel NULL pointer dereference at virtual address 00000004
pgd = dd064000
[00000004] *pgd=9e58a831, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1] PREEMPT ARM
Modules linked in: mbx(O)
CPU: 0 PID: 408 Comm: sh Tainted: G O 4.8.12-yocto-standard #1
Hardware name: Generic AM33XX (Flattened Device Tree)
task: dd2a6a00 task.stack: de596000
PC is at mbx_write+0x24/0xbc [mbx]
LR is at __vfs_write+0x28/0x48
pc : [<bf0000f0>] lr : [<c024ff40>] psr: 800e0013
sp : de597f18 ip : de597f38 fp : de597f34
r10: 00000000 r9 : de596000 r8 : 00000000
r7 : de597f80 r6 : 000fda00 r5 : 00000002 r4 : 00000000
r3 : de597f80 r2 : 00000002 r1 : 000fda00 r0 : de49ee40
Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 9d064019 DAC: 00000051
Process sh (pid: 408, stack limit = 0xde596210)
The line of the Oops that reads PC is at mbx_write+0x24/0xbc [mbx] tells you most of what you want to know: the last instruction was in the mbx_write function of a kernel module named mbx. Furthermore, it was at offset 0x24 bytes from the start of the function, which is 0xbc bytes long.
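When you have many logs to sift through, the same fields can be pulled out mechanically. This is a small sketch under stated assumptions: the helper name oops_pc_fields is hypothetical, it expects the ARM-style "PC is at" line shown above, and it relies on a sed that supports -E (GNU sed and BusyBox sed both do). It prints the symbol, the offset, the function size, and the module name if there is one:

```shell
# Split an Oops "PC is at" line (read from stdin) into four fields:
# symbol, offset into the function, function size, module name.
# Lines that do not match are ignored.
oops_pc_fields() {
    sed -n -E 's/.*PC is at ([^+]+)\+(0x[0-9a-f]+)\/(0x[0-9a-f]+)( \[(.*)\])?$/\1 \2 \3 \5/p'
}

echo 'PC is at mbx_write+0x24/0xbc [mbx]' | oops_pc_fields
```

For a fault in built-in kernel code there is no [module] suffix, and the last field is left empty.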
Next, take a look at the backtrace:
Stack: (0xde597f18 to 0xde598000)
7f00: bf0000cc 00000002
7f20: 000fda00 de597f80 de597f4c de597f38 c024ff40 bf0000d8 de49ee40 00000002
7f40: de597f7c de597f50 c0250c40 c024ff24 c026eb04 c026ea70 de49ee40 de49ee40
7f60: 000fda00 00000002 c0107908 de596000 de597fa4 de597f80 c025187c c0250b80
7f80: 00000000 00000000 00000002 000fda00 b6eecd60 00000004 00000000 de597fa8
7fa0: c0107700 c0251838 00000002 000fda00 00000001 000fda00 00000002 00000000
7fc0: 00000002 000fda00 b6eecd60 00000004 00000002 00000002 000ce80c 00000000
7fe0: 00000000 bef77944 b6e1afbc b6e73d00 600e0010 00000001 d3bbdad3 d54367bf
[<bf0000f0>] (mbx_write [mbx]) from [<c024ff40>] (__vfs_write+0x28/0x48)
[<c024ff40>] (__vfs_write) from [<c0250c40>] (vfs_write+0xcc/0x158)
[<c0250c40>] (vfs_write) from [<c025187c>] (SyS_write+0x50/0x88)
[<c025187c>] (SyS_write) from [<c0107700>] (ret_fast_syscall+0x0/0x3c)
Code: e590407c e3520b01 23a02b01 e1a05002 (e5842004)
---[ end trace edcc51b432f0ce7d ]---
In this case, we don't learn much more, merely that mbx_write was called from the virtual filesystem function __vfs_write.
It would be very nice to find the line of code that relates to mbx_write+0x24, for which we can use the GDB command disassemble with the /s modifier so that it shows source and assembler code together. In this example, the code is in the module mbx.ko, so we load that into gdb:
$ arm-poky-linux-gnueabi-gdb mbx.ko
[...]
(gdb) disassemble /s mbx_write
Dump of assembler code for function mbx_write:
99 {
0x000000f0 <+0>: mov r12, sp
0x000000f4 <+4>: push {r4, r5, r6, r7, r11, r12, lr, pc}
0x000000f8 <+8>: sub r11, r12, #4
0x000000fc <+12>: push {lr} ; (str lr, [sp, #-4]!)
0x00000100 <+16>: bl 0x100 <mbx_write+16>
100 struct mbx_data *m = (struct mbx_data *)file->private_data;
0x00000104 <+20>: ldr r4, [r0, #124] ; 0x7c
0x00000108 <+24>: cmp r2, #1024 ; 0x400
0x0000010c <+28>: movcs r2, #1024 ; 0x400
101 if (length > MBX_LEN)
102 length = MBX_LEN;
103 m->mbx_len = length;
0x00000110 <+32>: mov r5, r2
0x00000114 <+36>: str r2, [r4, #4]
The Oops told us that the error occurred at mbx_write+0x24. From the disassembly, we can see that mbx_write is at address 0xf0. Adding 0x24 gives 0x114, which is generated by the code on line 103.
You may think that I have got the wrong instruction, because the listing reads 0x00000114 <+36>: str r2, [r4, #4]. Surely we are looking for +24, not +36? Ah, but the authors of GDB are trying to confuse us here. The offsets are displayed in decimal, not hex: 36 = 0x24, so I got the right one after all!
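If you would rather not do the hex arithmetic in your head, the shell can do it. This is a minimal sketch (the function name oops_addr is hypothetical) that adds the Oops offset to the start address of the function and prints the result in the same form that GDB uses in the listing, with the offset in decimal:

```shell
# $1 = start address of the function in the disassembly (for example 0xf0)
# $2 = offset reported by the Oops (for example 0x24)
# Prints the absolute address in hex and the offset in decimal, as GDB does.
oops_addr() {
    printf '0x%08x <+%d>\n' $(( $1 + $2 )) $(( $2 ))
}

oops_addr 0xf0 0x24
```

For the values in this example it prints 0x00000114 <+36>, which is the str instruction generated for line 103.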
You can see from line 100 that m has the type struct mbx_data *. Here is the place where that structure is defined:
#define MBX_LEN 1024
struct mbx_data {
    char mbx[MBX_LEN];
    int mbx_len;
};
So it looks like the m variable is a null pointer, and that is what is causing the Oops. Looking at the code where m is initialized, we can see that there is a line missing. With the driver modified to initialize the pointer (the assignment to file->private_data in the following code block), it works fine, without the Oops:
static int mbx_open(struct inode *inode, struct file *file)
{
    if (MINOR(inode->i_rdev) >= NUM_MAILBOXES) {
        printk("Invalid mbx minor number\n");
        return -ENODEV;
    }
    file->private_data = &mailboxes[MINOR(inode->i_rdev)];
    return 0;
}
Preserving the Oops
Decoding an Oops is only possible if you can capture it in the first place. If the system crashes during boot before the console is enabled, or after a suspend, you won't see it. There are mechanisms to log kernel Oops and messages to an MTD partition or to persistent memory, but here is a simple technique that works in many cases and needs little prior thought.
So long as the contents of memory are not corrupted during a reset (and usually they are not), you can reboot into the bootloader and use it to display memory. You need to know the location of the kernel log buffer, remembering that it is a simple ring buffer of text messages. The symbol is __log_buf. Look this up in System.map for the kernel:
$ grep __log_buf System.map
c0f72428 b __log_buf
Then, map that kernel logical address into a physical address that U-Boot can understand by subtracting PAGE_OFFSET and adding the physical start of RAM. PAGE_OFFSET is almost always 0xc0000000, and the start address of the RAM is 0x80000000 on a BeagleBone, so the calculation becomes 0xc0f72428 - 0xc0000000 + 0x80000000 = 0x80f72428.
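The same calculation can be written as a one-line shell function on the host. This is only a sketch: the name log_buf_phys is hypothetical, PAGE_OFFSET and the RAM base are passed in as arguments because they depend on your kernel configuration and board, and it assumes a shell with 64-bit arithmetic (as bash has) so that the intermediate values do not overflow:

```shell
# Translate a kernel logical address into a physical address for U-Boot.
# $1 = address as printed in System.map (hex, without the 0x prefix)
# $2 = PAGE_OFFSET
# $3 = physical start address of RAM
log_buf_phys() {
    printf '%x\n' $(( 0x$1 - $2 + $3 ))
}

log_buf_phys c0f72428 0xc0000000 0x80000000
```

With the BeagleBone values used here it prints 80f72428, which is the address to give to the U-Boot md command.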
Now you can use the U-Boot md command to show the log:
U-Boot# md 80f72428
80f72428: 00000000 00000000 00210034 c6000000    ........4.!.....
80f72438: 746f6f42 20676e69 756e694c 6e6f2078    Booting Linux on
80f72448: 79687020 61636973 5043206c 78302055     physical CPU 0x
80f72458: 00000030 00000000 00000000 00730084    0.............s.
80f72468: a6000000 756e694c 65762078 6f697372    ....Linux versio
80f72478: 2e34206e 30312e31 68632820 40736972    n 4.1.10 (chris@
80f72488: 6c697562 29726564 63672820 65762063    builder) (gcc ve
80f72498: 6f697372 2e34206e 20312e39 6f726328    rsion 4.9.1 (cro
80f724a8: 6f747373 4e2d6c6f 2e312047 302e3032    sstool-NG 1.20.0
80f724b8: 20292029 53203123 5720504d 4f206465    ) ) #1 SMP Wed O
80f724c8: 32207463 37312038 3a31353a 47203335    ct 28 17:51:53 G
From Linux 3.5 onward, there is a 16-byte binary header for each line in the kernel log buffer that encodes a timestamp, a log level, and other things. There is a discussion about it in the Linux Weekly News article titled Toward more reliable logging at https://lwn.net/Articles/492125/.
Further reading
The following resources have further information about the topics introduced in this chapter:
The Art of Debugging with GDB, DDD, and Eclipse, by Norman Matloff and Peter Jay Salzman, No Starch Press; 1st edition (28 Sept, 2008), ISBN 978-1593271749
GDB Pocket Reference, by Arnold Robbins, O'Reilly Media; 1st edition (12 May, 2005), ISBN 978-0596100278
Getting to grips with Eclipse: cross compiling, http://2net.co.uk/tutorial/eclipse-cross-compile
Getting to grips with Eclipse: remote access and debugging, http://2net.co.uk/tutorial/eclipse-rse
Summary
Knowing how to use GDB for interactive debugging is a useful tool in the embedded developer's tool-chest. It is a stable, well-documented, and well-known entity. It has the ability to debug remotely by placing an agent on the target, be it gdbserver for applications or kgdb for kernel code, and although the default command-line user interface takes a while to get used to, there are many alternative frontends. The three I mentioned were TUI, DDD, and Eclipse CDT, which should cover most situations, but there are other frontends around that you can try.
A second and equally important way to approach debugging is to collect crash reports and analyze them offline. In this category, we looked at application core dumps and kernel Oops messages.
However, this is only one way of identifying flaws in programs. In the next chapter, I will talk about profiling and tracing as ways of analyzing and optimizing programs.
Profiling and Tracing
Interactive debugging using a source-level debugger, as described in the previous chapter, can give you an insight into the way a program works, but it constrains your view to a small body of code. In this chapter, we will look at the larger picture to see whether the system is performing as intended.
Programmers and system designers are notoriously bad at guessing where bottlenecks are. So if your system has performance issues, it is wise to start by looking at the full system and then work down, using more sophisticated tools as you go. In this chapter, I'll begin with the well-known command top as a means of getting an overview. Often the problem can be localized to a single program, which you can analyze using the Linux profiler, perf. If the problem is not so localized and you want to get a broader picture, perf can do that as well. To diagnose problems associated with the kernel, I will describe the trace tools, Ftrace and LTTng, as a means of gathering detailed information.
I will also cover Valgrind, which, because of its sandboxed execution environment, can monitor a program and report on code as it runs. I will complete the chapter with a description of a simple trace tool, strace, which reveals the execution of a program by tracing the system calls it makes.
In this chapter, we will cover the following topics:
The observer effect
Beginning to profile
Profiling with top
Poor man's profiler
Introducing perf
Other profilers: OProfile and gprof
Tracing events
Introducing Ftrace
Using LTTng
Using Valgrind
Using strace
The observer effect
Before diving into the tools, let's talk about what the tools will show you. As is the case in many fields, measuring a certain property affects the observation itself. Measuring the electric current in a power supply line requires measuring the voltage drop over a small resistor. However, the resistor itself affects the current. The same is true for profiling: every system observation has a cost in CPU cycles, and that resource is no longer spent on the application. Measurement tools also mess up caching behavior, eat memory space, and write to disk, which all make it worse. There is no measurement without overhead.
I've often heard engineers say that the results of a profiling job were totally misleading. That is usually because they were performing the measurements on something not approaching a real situation. Always try to measure on the target, using release builds of the software, with a valid data set, using as few extra services as possible.
Symbol tables and compile flags
We will hit a problem right away. While it is important to observe the system in its natural state, the tools often need additional information to make sense of the events.
Some tools require special kernel options. For the tools we are examining in this chapter, this applies to perf, Ftrace, and LTTng. Therefore, you will probably have to build and deploy a new kernel for these tests.
Debug symbols are very helpful in translating raw program addresses into function names and lines of code. Deploying executables with debug symbols does not change the execution of the code, but it does require that you have copies of the binaries and the kernel compiled with debug information, at least for the components you want to profile. Some tools work best if you have these installed on the target system: perf, for example. The techniques are the same as for general debugging, as I discussed in Chapter 14, Debugging with GDB.
If you want a tool to generate call graphs, you may have to compile with stack frames enabled. If you want the tool to attribute addresses to lines of code accurately, you may need to compile with lower levels of optimization.
Finally, some tools require instrumentation to be inserted into the program to capture samples, so you will have to recompile those components. This applies to gprof for applications and Ftrace and LTTng for the kernel.
Be aware that the more you change the system you are observing, the harder it is to relate the measurements you make to the production system.
It is best to adopt a wait-and-see approach, making changes only when the need is clear, and being mindful that each time you do so, you will change what you are measuring.
Beginning to profile
When looking at the entire system, a good place to start is with a simple tool such as top, which gives you an overview very quickly. It shows you how much memory is being used, which processes are eating CPU cycles, and how this is spread across different cores and time.
If top shows that a single application is using up all the CPU cycles in user space, then you can profile that application using perf.
If two or more processes have a high CPU usage, there is probably something that is coupling them together, perhaps data communication. If a lot of cycles are spent in system calls or handling interrupts, then there may be an issue with the kernel configuration or with a device driver. In either case, you need to start by taking a profile of the whole system, again using perf.
If you want to find out more about the kernel and the sequencing of events there, you would use Ftrace or LTTng.
There could be other problems that top will not help you with. If you have multi-threaded code and there are problems with lockups, or if you have random data corruption, then Valgrind plus the Helgrind plugin might be helpful. Memory leaks also fit into this category: I covered memory-related diagnosis in Chapter 13, Managing Memory.
Profiling with top
The top program is a simple tool that doesn't require any special kernel options or symbol tables. There is a basic version in BusyBox and a more functional version in the procps package, which is available in the Yocto Project and Buildroot. You may also want to consider using htop, which is functionally similar to top but has a nicer user interface (some people think).
To begin with, focus on the summary line of top, which is the second line if you are using BusyBox and the third line if using top from procps. Here is an example, using BusyBox top:
Mem: 57044K used, 446172K free, 40K shrd, 3352K buff, 34452K cached
CPU: 58% usr 4% sys 0% nic 0% idle 37% io 0% irq 0% sirq
Load average: 0.24 0.06 0.02 2/51 105
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
105 104 root R 27912 6% 61% ffmpeg -i track2.wav
[...]
The summary line shows the percentage of time spent running in various states, as shown in this table:
procps   BusyBox   Description
us       usr       User space programs with default nice value
sy       sys       Kernel code
ni       nic       User space programs with non-default nice value
id       idle      Idle
wa       io        I/O wait
hi       irq       Hardware interrupts
si       sirq      Software interrupts
st       -         Steal time: only relevant in virtual environments
In the preceding example, most of the time (58%) is spent in user mode, with a small amount (4%) in system mode, so this is a system that is CPU-bound in user space. The first line after the summary shows that just one application is responsible: ffmpeg. Any efforts toward reducing CPU usage should be directed there.
Here is another example:
Mem: 13128K used, 490088K free, 40K shrd, 0K buff, 2788K cached
CPU: 0% usr 99% sys 0% nic 0% idle 0% io 0% irq 0% sirq
Load average: 0.41 0.11 0.04 2/46 97
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
92 82 root R 2152 0% 100% cat /dev/urandom
[...]
This system is spending almost all of the time in kernel space (99% sys), as a result of cat reading from /dev/urandom. In this artificial case, profiling cat by itself would not help, but profiling the kernel functions that cat calls might.
The default view of top shows only processes, so the CPU usage is the total of all the threads in the process. Press H to see information for each thread. Likewise, it aggregates the time across all CPUs. If you are using the procps version of top, you can see a summary per CPU by pressing the 1 key.
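Reading the summary line by eye is fine at the console, but a script can do the same triage when you are logging a long test run. The following is a rough sketch, assuming the BusyBox layout shown in the two examples (a CPU: label followed by percentage and state pairs separated by spaces); the function name top_dominant is hypothetical. It prints the state with the largest share and its percentage:

```shell
# Read BusyBox top output on stdin and report the busiest CPU state.
# Assumes a line of the form: CPU: 58% usr 4% sys 0% nic ...
top_dominant() {
    awk '$1 == "CPU:" {
        best = ""; max = -1
        # fields come in pairs: a percentage followed by the state name
        for (i = 2; i < NF; i += 2) {
            pct = $i
            sub(/%/, "", pct)
            if (pct + 0 > max) { max = pct + 0; best = $(i + 1) }
        }
        print best, max
    }'
}

echo 'CPU: 58% usr 4% sys 0% nic 0% idle 37% io 0% irq 0% sirq' | top_dominant
```

A result of usr points toward profiling an application, whereas sys, irq, or sirq suggests looking at the kernel or a device driver first.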
Poor man's profiler
You can profile an application just by using GDB to stop it at arbitrary intervals to see what it is doing. This is the poor man's profiler. It is easy to set up and is one way of gathering profile data.
The procedure is simple:
1. Attach to the process using gdbserver (for a remote debug) or GDB (for a native debug). The process stops.
2. Observe the function it stopped in. You can use the backtrace GDB command to see the call stack.
3. Type continue so that the program resumes.
4. After a while, type Ctrl + C to stop it again, and go back to step 2.
If you repeat steps 2 to 4 several times, you will quickly get an idea of whether it is looping or making progress, and if you repeat them often enough, you will get an idea of where the hot spots in the code are.
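The steps above can be automated for a native debug. What follows is a sketch under assumptions rather than a finished tool: it assumes that GDB is installed on the machine running the process, and the names sample_stacks and tally_top_frames are hypothetical. The first function attaches with gdb in batch mode, prints a backtrace, and detaches, a given number of times; the second counts how often each function appears in frame #0:

```shell
# Take $2 backtrace samples of process $1, $3 seconds apart.
# Each gdb invocation attaches, runs bt, and detaches again.
sample_stacks() {
    pid=$1; count=$2; interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        gdb -batch -ex "bt" -p "$pid" 2>/dev/null
        sleep "$interval"
        i=$((i + 1))
    done
}

# Count the function named in frame #0 of each backtrace on stdin.
# Frame lines look like "#0  0x... in func (...)" or "#0  func (...)".
tally_top_frames() {
    awk '$1 == "#0" { f = ($3 == "in") ? $4 : $2; n[f]++ }
         END { for (f in n) print n[f], f }' | sort -rn
}

# Example: sample_stacks 1234 20 0.5 | tally_top_frames
```

The functions at the top of the tally are the likely hot spots. A few dozen samples are often enough to see a pattern.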
There is a whole web page dedicated to the idea at http://poormansprofiler.org, together with scripts that make it a little easier. I have used this technique many times over the years with various operating systems and debuggers.
This is an example of statistical profiling, in which you sample the program state at intervals. After a number of samples, you begin to learn the statistical likelihood of the functions being executed. It is surprising how few you really need. Other statistical profilers are perf record, OProfile, and gprof.
Sampling using a debugger is intrusive because the program is stopped for a significant period while you collect the sample. Other tools can do this with much lower overhead.
Introducing perf
perf is an abbreviation of the Linux performance event counter subsystem, perf_events, and also the name of the command-line tool for interacting with perf_events. Both have been part of the kernel since Linux 2.6.31. There is plenty of useful information in the Linux source tree in tools/perf/Documentation as well as at https://perf.wiki.kernel.org.
The initial impetus for developing perf was to provide a unified way to access the registers of the performance measurement unit (PMU), which is part of most modern processor cores. Once the API was defined and integrated into Linux, it became logical to extend it to cover other types of performance counters.
At its heart, perf is a collection of event counters with rules about when they actively collect data. By setting the rules, you can capture data from the whole system, or just the kernel, or just one process and its children, and do it across all CPUs or just one CPU. It is very flexible. With this one tool, you can start by looking at the whole system, then zero in on a device driver that seems to be causing problems, or an application that is running slowly, or a library function that seems to be taking longer to execute than you thought.
The code for the perf command-line tool is part of the kernel, in the tools/perf directory. The tool and the kernel subsystem are developed hand in hand, meaning that they must be from the same version of the kernel. perf can do a lot. In this chapter, I will examine it only as a profiler. For a description of its other capabilities, read the perf man pages and refer to the documentation mentioned at the start of this section.
Configuring the kernel for perf
You need a kernel that is configured for perf_events, and you need the perf command cross compiled to run on the target. The relevant kernel configuration is CONFIG_PERF_EVENTS, present in the menu General setup | Kernel Performance Events And Counters.
If you want to profile using tracepoints (more on this subject later), also enable the options described in the section about Ftrace. While you are there, it is worthwhile enabling CONFIG_DEBUG_INFO as well.
The perf command has many dependencies, which makes cross compiling it quite messy. However, both the Yocto Project and Buildroot have target packages for it.
You will also need debug symbols on the target for the binaries that you are interested in profiling; otherwise, perf will not be able to resolve addresses to meaningful symbols. Ideally, you want debug symbols for the whole system, including the kernel. For the latter, remember that the debug symbols for the kernel are in the vmlinux file.
Building perf with the Yocto Project
If you are using the standard linux-yocto kernel, perf_events is enabled already, so there is nothing more to do.
To build the perf tool, you can add it explicitly to the target image dependencies, or you can add the tools-profile feature, which also brings in gprof. As I mentioned previously, you will probably want debug symbols on the target image and also the kernel vmlinux image. In total, this is what you will need in conf/local.conf:
EXTRA_IMAGE_FEATURES = "debug-tweaks dbg-pkgs tools-profile"
IMAGE_INSTALL_append = " kernel-vmlinux"
Building perf with Buildroot
Many Buildroot kernel configurations do not include perf_events, so you should begin by checking that your kernel includes the options mentioned in the preceding section.
To cross compile perf, run the Buildroot menuconfig and select the following:
BR2_LINUX_KERNEL_TOOL_PERF in Kernel | Linux Kernel Tools.
To build packages with debug symbols and install them unstripped on the target, select these two settings:
BR2_ENABLE_DEBUG in the menu Build options | build packages with debugging symbols.
BR2_STRIP = none in the menu Build options | strip command for binaries on target.
Then, run make clean, followed by make.
When you have built everything, you will have to copy vmlinux into the target image manually.
Profiling with perf
You can use perf to sample the state of a program using one of the event counters and accumulate samples over a period of time to create a profile. This is another example of statistical profiling. The default event counter is called cycles, which is a generic hardware counter that is mapped to a PMU register representing a count of cycles at the core clock frequency.
Creating a profile using perf is a two-stage process: the perf record command captures samples and writes them to a file named perf.data (by default), and then perf report analyzes the results. Both commands are run on the target. The samples being collected are filtered for the process and children of a command you specify. Here is an example profiling a shell script that searches for the string linux:
# perf record sh -c "find /usr/share | xargs grep linux > /dev/null"
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.368 MB perf.data (~16057 samples) ]
# ls -l perf.data
-rw------- 1 root root 387360 Aug 25 2015 perf.data
Now you can show the results from perf.data using the command perf report. There are three user interfaces you can select on the command line:
--stdio: This is a pure text interface with no user interaction. You will have to launch perf report and annotate for each view of the trace.
--tui: This is a simple text-based menu interface with traversal between screens.
--gtk: This is a graphical interface that otherwise acts in the same way as --tui.
The default is TUI, as shown in this example:
perf is able to record the kernel functions executed on behalf of the processes because it collects samples in kernel space.
The list is ordered with the most active functions first. In this example, all but one are captured while grep is running. Some are in a library, libc-2.20, some in a program, busybox.nosuid, and some are in the kernel. We have symbol names for program and library functions because all the binaries have been installed on the target with debug information, and kernel symbols are being read from /boot/vmlinux. If you have vmlinux in a different location, add -k <path> to the perf report command. Rather than storing samples in perf.data, you can save them to a different file using perf record -o <file name> and analyze them using perf report -i <file name>.
By default, perf record samples at a frequency of 1000 Hz using the cycles counter.
A sampling frequency of 1000 Hz may be higher than you really need, and may be the cause of an observer effect. Try with lower rates: 100 Hz is enough for most cases, in my experience. You can set the sample frequency using the -F option.
Call graphs
This is still not really making life easy; the functions at the top of the list are mostly low-level memory operations, and you can be fairly sure that they have already been optimized. It would be nice to step back and see where these functions are being called from. You can do that by capturing the backtrace from each sample, which you can do with the -g option to perf record.
Now, perf report shows a plus sign (+) where the function is part of a call chain. You can expand the trace to see the functions lower down in the chain:
Generating call graphs relies on the ability to extract call frames from the stack, just as is necessary for backtraces in GDB. The information needed to unwind stacks is encoded in the debug information of the executables, but not all combinations of architecture and toolchains are capable of doing so.
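The options mentioned so far (-F, -g, and -o) combine naturally into a small wrapper. This is a sketch only: the function name profile_cmd and the PERF variable are hypothetical conveniences, the latter so that you can point the wrapper at a differently named perf binary or do a dry run with PERF=echo. It records call graphs at a chosen sample frequency into a named file:

```shell
# Record a call-graph profile of a command at a given sample frequency.
# $1 = sample frequency in Hz, $2 = output file, remaining args = command.
# Set PERF=echo to print the perf command line instead of running it.
profile_cmd() {
    freq=$1; data=$2
    shift 2
    ${PERF:-perf} record -F "$freq" -g -o "$data" -- "$@"
}

# Example (on the target):
#   profile_cmd 100 /tmp/grep.data sh -c "find /usr/share | xargs grep linux > /dev/null"
#   perf report -i /tmp/grep.data
```

Keeping the frequency as an explicit argument makes it easy to repeat a measurement at a lower rate and check whether the profile changes, which is one way to spot an observer effect.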
perf annotate
Now that you know which functions to look at, it would be nice to step inside and see the code and to have hit counts for each instruction. That is what perf annotate does, by calling down to a copy of objdump installed on the target. You just need to use perf annotate in place of perf report.
perf annotate requires symbol tables for the executables and vmlinux. Here is an example of an annotated function:
If you want to see the source code interleaved with the assembler, you can copy the relevant source files to the target device. If you are using the Yocto Project and build with the extra image feature dbg-pkgs, or have installed the individual -dbg package, then the source will have been installed for you in /usr/src/debug. Otherwise, you can examine the debug information to see the location of the source code:
$ arm-buildroot-linux-gnueabi-objdump --dwarf lib/libc-2.19.so | grep DW_AT_comp_dir
<3f> DW_AT_comp_dir : /home/chris/buildroot/output/build/host-gcc-initial-4.8.3/build/arm-buildroot-linux-gnueabi/libgcc
The path on the target should be exactly the same as the path you can see in DW_AT_comp_dir.
Here is an example of annotation with source and assembler code:
Other profilers: OProfile and gprof
OProfile and gprof are two statistical profilers that predate perf. They both offer subsets of the functionality of perf, but they are still quite popular. I will mention them only briefly.
OProfile is a kernel profiler that started out in 2002. Originally, it had its own kernel sampling code, but recent versions use the perf_events infrastructure for that purpose. There is more information about it at http://oprofile.sourceforge.net. OProfile consists of a kernel-space component and a user space daemon and analysis commands.
OProfile needs these two kernel options to be enabled:
CONFIG_PROFILING in General setup | Profiling support
CONFIG_OPROFILE in General setup | OProfile system profiling
If you are using the Yocto Project, the user space components are installed as part of the tools-profile image feature. If you are using Buildroot, the package is enabled by BR2_PACKAGE_OPROFILE.
You can collect samples using this command:
# operf <program>
Wait for your application to finish, or press Ctrl + C, to stop profiling. The profile data is stored in <cur-dir>/oprofile_data/samples/current.
Use opreport to generate a profile summary. There are various options, which are documented in the OProfile manual.
gprof is part of the GNU toolchain and was one of the earliest open source code-profiling tools. It combines compile-time instrumentation and sampling techniques, using a 100 Hz sample rate. It has the advantage that it does not require kernel support.
To prepare a program for profiling with gprof, you add -pg to the compile and link flags, which injects code that collects information about the call tree into the function preamble. When you run the program, samples are collected and stored in a buffer, which is written to a file named gmon.out when the program terminates.
You use the gprof command to read the samples from gmon.out, together with the debug information from a copy of the program.
As an example, if you wanted to profile the BusyBox grep applet, you would rebuild BusyBox with the -pg option, run the command, and view the results:
# busybox grep "linux" *
# ls -l gmon.out
-rw-r--r-- 1 root root 473 Nov 24 14:07 gmon.out
Then, you would analyze the captured samples on either the target or the host, using the following:
# gprof busybox
Flat profile:
Each sample counts as 0.01 seconds.
no time accumulated
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
 0.00      0.00     0.00      688     0.00     0.00  xrealloc
 0.00      0.00     0.00      345     0.00     0.00  bb_get_chunk_from_file
 0.00      0.00     0.00      345     0.00     0.00  xmalloc_fgetline
 0.00      0.00     0.00        6     0.00     0.00  fclose_if_not_stdin
 0.00      0.00     0.00        6     0.00     0.00  fopen_for_read
 0.00      0.00     0.00        6     0.00     0.00  grep_file
[...]
Call graph
granularity: each sample hit covers 2 byte(s) no time propagated
index % time    self  children    called     name
                0.00    0.00     688/688         bb_get_chunk_from_file [2]
[1]      0.0    0.00    0.00     688         xrealloc [1]
----------------------------------------------------------
                0.00    0.00     345/345         xmalloc_fgetline [3]
[2]      0.0    0.00    0.00     345         bb_get_chunk_from_file [2]
                0.00    0.00     688/688         xrealloc [1]
----------------------------------------------------------
                0.00    0.00     345/345         grep_file [6]
[3]      0.0    0.00    0.00     345         xmalloc_fgetline [3]
                0.00    0.00     345/345         bb_get_chunk_from_file [2]
----------------------------------------------------------
                0.00    0.00       6/6           grep_main [12]
[4]      0.0    0.00    0.00       6         fclose_if_not_stdin [4]
[...]
The execution times are all shown as zero because most of the time was spent in system calls, which are not traced by gprof.
gprof does not capture samples from threads other than the main thread of a multi-threaded process, and it does not sample kernel space, both of which limit its usefulness.
Tracing events
The tools we have seen so far all use statistical sampling. You often want to know more about the ordering of events so that you can see them and relate them to each other. Function tracing involves instrumenting the code with tracepoints that capture information about the event, and may include some or all of the following:
Timestamp
Context, such as the current PID
Function parameters and return value
Call stack
It is more intrusive than statistical profiling, and it can generate a large amount of data. The latter can be mitigated by applying filters when the sample is captured, and later on when viewing the trace.
I will cover two trace tools here: the kernel function tracers, Ftrace and LTTng.
Introducing Ftrace
The kernel function tracer, Ftrace, evolved from work done by Steven Rostedt, and many others, as they were tracking down the causes of high scheduling latency in real-time applications. Ftrace appeared in Linux 2.6.27 and has been actively developed since then. There are a number of documents describing kernel tracing in the kernel source in Documentation/trace.
Ftrace consists of a number of tracers that can log various types of activity in the kernel. Here, I am going to talk about the function and function_graph tracers and about the event tracepoints. In Chapter 16, Real-Time Programming, I will revisit Ftrace and use it to show real-time latencies.
The function tracer instruments each kernel function so that calls can be recorded and timestamped. As a matter of interest, it compiles the kernel with the -pg switch to inject the instrumentation, but the resemblance to gprof ends there. The function_graph tracer goes further and records both the entry and exit of functions so that it can create a call graph. The event tracepoints feature also records parameters associated with the call.
Ftrace has a very embedded-friendly user interface that is entirely implemented through virtual files in the debugfs filesystem, meaning that you do not have to install any tools on the target to make it work. Nevertheless, there are other user interfaces if you prefer: trace-cmd is a command-line tool that records and views traces, and is available in Buildroot (BR2_PACKAGE_TRACE_CMD) and the Yocto Project (trace-cmd). There is a graphical trace viewer named KernelShark, which is available as a package for the Yocto Project.
Preparing to use Ftrace
Ftrace and its various options are configured in the kernel configuration menu. You will need the following as a minimum:
CONFIG_FUNCTION_TRACER in the menu Kernel hacking | Tracers | Kernel Function Tracer
For reasons that will become clear later, you would be well advised to turn on these options as well:
CONFIG_FUNCTION_GRAPH_TRACER in the menu Kernel hacking | Tracers | Kernel Function Graph Tracer
CONFIG_DYNAMIC_FTRACE in the menu Kernel hacking | Tracers | enable/disable function tracing dynamically
Since the whole thing is hosted in the kernel, there is no user space configuration to be done.
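Before rebuilding, it can save time to check which of these options are already set in your kernel .config. Here is a minimal sketch (the function name missing_config is hypothetical) that prints each option from a list that is not set to y in the given configuration file:

```shell
# Print every option from the argument list that is not built in (=y)
# in the kernel configuration file $1.
missing_config() {
    cfg=$1
    shift
    for opt in "$@"; do
        grep -q "^$opt=y\$" "$cfg" || echo "$opt"
    done
}

# Example:
#   missing_config .config CONFIG_FUNCTION_TRACER \
#       CONFIG_FUNCTION_GRAPH_TRACER CONFIG_DYNAMIC_FTRACE
```

If the function prints nothing, all the listed options are present. The same check works for the perf options, such as CONFIG_PERF_EVENTS and CONFIG_DEBUG_INFO.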
Using Ftrace
Before you can use Ftrace, you have to mount the debugfs filesystem, which by convention goes in the /sys/kernel/debug directory:
# mount -t debugfs none /sys/kernel/debug
All the controls for Ftrace are in the /sys/kernel/debug/tracing directory; there is even a mini HOWTO in the README file there.
This is the list of tracers available in the kernel:
# cat /sys/kernel/debug/tracing/available_tracers
blk function_graph function nop
The active tracer is shown by current_tracer, which initially will be the null tracer, nop.
To capture a trace, select the tracer by writing the name of one of the available_tracers to current_tracer, and then enable tracing for a short while, as shown here:
# echo function > /sys/kernel/debug/tracing/current_tracer
# echo 1 > /sys/kernel/debug/tracing/tracing_on
# sleep 1
# echo 0 > /sys/kernel/debug/tracing/tracing_on
In that one second, the trace buffer will have been filled with the details of every function called by the kernel. The format of the trace buffer is plain text, as described in Documentation/trace/ftrace.txt. You can read the trace buffer from the trace file:
# cat /sys/kernel/debug/tracing/trace
# tracer: function
#
# entries-in-buffer/entries-written: 40051/40051 #P:1
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
              sh-361   [000] ...1   992.990646: mutex_unlock <-rb_simple_write
              sh-361   [000] ...1   992.990658: __fsnotify_parent <-vfs_write
              sh-361   [000] ...1   992.990661: fsnotify <-vfs_write
              sh-361   [000] ...1   992.990663: __srcu_read_lock <-fsnotify
              sh-361   [000] ...1   992.990666: preempt_count_add <-__srcu_read_lock
              sh-361   [000] ...2   992.990668: preempt_count_sub <-__srcu_read_lock
              sh-361   [000] ...1   992.990670: __srcu_read_unlock <-fsnotify
              sh-361   [000] ...1   992.990672: __sb_end_write <-vfs_write
              sh-361   [000] ...1   992.990674: preempt_count_add <-__sb_end_write
[...]
You can capture a large number of data points in just one second: in this case, over 40,000.
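Since the whole interface is a handful of file writes, the capture sequence is easy to wrap in a function. The following is a sketch only: the name ftrace_capture is hypothetical, and the third argument (the tracing directory) exists so that you can override the conventional location if debugfs is mounted somewhere else. It selects a tracer, enables tracing for a number of seconds, and prints the trace buffer:

```shell
# Capture a trace with the named tracer for a fixed time and print it.
# $1 = tracer name (one of available_tracers)
# $2 = capture time in seconds
# $3 = tracing directory (optional)
ftrace_capture() {
    tracer=$1
    secs=$2
    t=${3:-/sys/kernel/debug/tracing}
    echo 0 > "$t/tracing_on"              # make sure tracing is stopped
    echo "$tracer" > "$t/current_tracer"  # select the tracer
    echo 1 > "$t/tracing_on"              # start tracing
    sleep "$secs"
    echo 0 > "$t/tracing_on"              # stop tracing
    cat "$t/trace"
}

# Example (on the target): ftrace_capture function 1 > /tmp/trace.txt
```

Redirecting the output to a file, as in the example, lets you copy the trace to the host and search it there.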
As with profilers, it is difficult to make sense of a flat function list like this. If you select the function_graph tracer, Ftrace captures call graphs like this:
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 0) + 63.167 us   |              } /* cpdma_ctlr_int_ctrl */
 0) + 73.417 us   |            } /* cpsw_intr_disable */
 0)               |            disable_irq_nosync() {
 0)               |              __disable_irq_nosync() {
 0)               |                __irq_get_desc_lock() {
 0)   0.541 us    |                  irq_to_desc();
 0)   0.500 us    |                  preempt_count_add();
 0) + 16.000 us   |                }
 0)               |                __disable_irq() {
 0)   0.500 us    |                  irq_disable();
 0)   8.208 us    |                }
 0)               |                __irq_put_desc_unlock() {
 0)   0.459 us    |                  preempt_count_sub();
 0)   8.000 us    |                }
 0) + 55.625 us   |              }
 0) + 63.375 us   |            }
Now you can see the nesting of the function calls, delimited by braces, { and }. At the terminating brace, there is a measurement of the time taken in the function, annotated with a plus sign (+) if it takes more than 10 µs and an exclamation mark (!) if it takes more than 100 µs.
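Those markers make it straightforward to pick the slow calls out of a long trace. This is a small sketch, assuming the function_graph layout shown above, where the marker is a separate field after the CPU number; the function name slow_calls is hypothetical:

```shell
# Keep only function_graph lines whose duration carries a + or ! marker,
# that is, calls that took more than 10 microseconds.
slow_calls() {
    awk '$1 ~ /^[0-9]+\)$/ && ($2 == "+" || $2 == "!")'
}

# Example: slow_calls < /tmp/trace.txt
```

Lines without a marker, including the function entry lines that have no duration, are dropped.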
You are often only interested in the kernel activity caused by a single process or thread, in which case you can restrict the trace to the one thread by writing the thread ID to set_ftrace_pid.
Dynamic Ftrace and trace filters
Enabling CONFIG_DYNAMIC_FTRACE allows Ftrace to modify the function trace sites at runtime, which has a couple of benefits. Firstly, it triggers additional build-time processing of the trace function probes, which allows the Ftrace subsystem to locate them at boot time and overwrite them with NOP instructions, thus reducing the overhead of the function trace code to almost nothing. You can then enable Ftrace in production or near-production kernels with negligible impact on performance.
The second advantage is that you can selectively enable function trace sites rather than tracing everything. The list of functions is put into available_filter_functions; there are several tens of thousands of them. You can selectively enable function traces as you need them by copying the name from available_filter_functions to set_ftrace_filter, and then stop tracing that function by writing the name to set_ftrace_notrace. You can also use wildcards and append names to the list. For example, suppose you are interested in tcp handling:
# cd /sys/kernel/debug/tracing
# echo "tcp*" > set_ftrace_filter
# echo function > current_tracer
# echo 1 > tracing_on
Run some tests and then look at the trace:
# cat trace
# tracer: function
#
# entries-in-buffer/entries-written: 590/590   #P:1
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
        dropbear-375   [000] ...1 48545.022235: tcp_poll <-sock_poll
        dropbear-375   [000] ...1 48545.022372: tcp_poll <-sock_poll
        dropbear-375   [000] ...1 48545.022393: tcp_sendmsg <-inet_sendmsg
        dropbear-375   [000] ...1 48545.022398: tcp_send_mss <-tcp_sendmsg
        dropbear-375   [000] ...1 48545.022400: tcp_current_mss <-tcp_send_mss
[...]
The set_ftrace_filter file can also contain commands, for example, to start and stop tracing when certain functions are executed. There isn't space to go into these details here, but if you want to find out more, read the Filter commands section in Documentation/trace/ftrace.txt.
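To illustrate appending names to the filter list and excluding a function, here is a minimal sketch. The helper names set_trace_filters and exclude_function, and the TRACEFS variable, are hypothetical. It relies on the fact that opening set_ftrace_filter with truncation resets the list, while >> appends to it:

```shell
#!/bin/sh
# Sketch only: helper names and TRACEFS are illustrative assumptions.
TRACEFS="${TRACEFS:-/sys/kernel/debug/tracing}"

# Replace the current filter list with the given patterns.
set_trace_filters() {
    : > "$TRACEFS/set_ftrace_filter"             # truncating resets the filter
    for pattern in "$@"; do
        echo "$pattern" >> "$TRACEFS/set_ftrace_filter"   # >> appends a name
    done
}

# Stop tracing one function, even if a wildcard in the filter matches it.
exclude_function() {
    echo "$1" > "$TRACEFS/set_ftrace_notrace"
}
```

For example, set_trace_filters 'tcp*' 'udp*' followed by exclude_function tcp_poll would trace the TCP and UDP functions but leave out tcp_poll. Quote the wildcards so that the shell does not expand them.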
Trace events
The function and function_graph tracers described in the preceding section record only the time at which the function was executed. The trace events feature also records parameters associated with the call, making the trace more readable and informative. For example, instead of just recording that the function kmalloc has been called, a trace event will record the number of bytes requested and the returned pointer. Trace events are used in perf and LTTng as well as Ftrace, but the development of the trace events subsystem was prompted by the LTTng project.
It takes effort from kernel developers to create trace events, since each one is different. They are defined in the source code using the TRACE_EVENT macro: there are over a thousand of them now. You can see the list of events available at runtime in /sys/kernel/debug/tracing/available_events. They are named subsystem:function, for example, kmem:kmalloc. Each event is also represented by a subdirectory in tracing/events/[subsystem]/[function], as demonstrated here:
# ls events/kmem/kmalloc
enable filter format id trigger
The files are as follows:
enable: You write a 1 to this file to enable the event.
filter: This is an expression that must evaluate to true for the event to be traced.
format: This is the format of the event and parameters.
id: This is a numeric identifier.
trigger: This is a command that is executed when the event occurs using the syntax defined in the Filter commands section of Documentation/trace/ftrace.txt.
I will show you a simple example involving kmalloc and kfree. Event tracing does not depend on the function tracers, so begin by selecting the nop tracer:
# echo nop > current_tracer
Next, select the events to trace by enabling each one individually:
# echo 1 > events/kmem/kmalloc/enable
# echo 1 > events/kmem/kfree/enable
You can also write the event names to set_event, as shown here:
# echo "kmem:kmalloc kmem:kfree" > set_event
Now, when you read the trace, you can see the functions and their parameters:
# tracer: nop
#
# entries-in-buffer/entries-written: 359/359   #P:1
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
             cat-382   [000] ...1  2935.586706: kmalloc: call_site=c0554644 ptr=de515a00
                bytes_req=384 bytes_alloc=512
                gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
             cat-382   [000] ...1  2935.586718: kfree: call_site=c059c2d8 ptr=(null)
Exactly the same trace events are visible in perf as tracepoint events.
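The filter file listed earlier can cut the volume of trace data considerably. As a hedged sketch, the function below records only kmalloc calls that request more than a given number of bytes, using the bytes_req field that appears in the trace output. The name trace_big_kmallocs and the TRACEFS variable are my own inventions for the example:

```shell
#!/bin/sh
# Sketch only: trace_big_kmallocs and TRACEFS are illustrative names.
TRACEFS="${TRACEFS:-/sys/kernel/debug/tracing}"

# Trace kmalloc events whose requested size exceeds min_bytes.
trace_big_kmallocs() {
    min_bytes="$1"
    event="$TRACEFS/events/kmem/kmalloc"
    echo nop > "$TRACEFS/current_tracer"            # events need no function tracer
    echo "bytes_req > $min_bytes" > "$event/filter" # expression on an event field
    echo 1 > "$event/enable"                        # switch the event on
}
```

Calling trace_big_kmallocs 256 would leave only the larger allocations in the trace. The field names available for any event are listed in its format file.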
Using LTTng
The Linux Trace Toolkit (LTT) project was started by Karim Yaghmour as a means of tracing kernel activity and was one of the first trace tools generally available for the Linux kernel. Later, Mathieu Desnoyers took up the idea and re-implemented it as the next-generation trace tool, LTTng. It was then expanded to cover user space traces as well as the kernel. The project website is at http://lttng.org/ and contains a comprehensive user manual.
LTTng consists of three components:
A core session manager
A kernel tracer implemented as a group of kernel modules
A user space tracer implemented as a library
In addition to those, you will need a trace viewer such as Babeltrace (http://www.efficios.com/babeltrace) or the Eclipse Trace Compass plugin to display and filter the raw trace data on the host or target.
LTTng requires a kernel configured with CONFIG_TRACEPOINTS, which is enabled when you select Kernel hacking | Tracers | Kernel Function Tracer.
The description that follows refers to LTTng version 2.5; other versions may be different.
LTTng and the Yocto Project
You need to add these packages to the target dependencies in conf/local.conf:
IMAGE_INSTALL_append = " lttng-tools lttng-modules lttng-ust"
If you want to run Babeltrace on the target, also append the package babeltrace.
LTTng and Buildroot
You need to enable the following:
BR2_PACKAGE_LTTNG_MODULES in the menu Target packages | Debugging, profiling and benchmark | lttng-modules
BR2_PACKAGE_LTTNG_TOOLS in the menu Target packages | Debugging, profiling and benchmark | lttng-tools
For user space tracing, enable this:
BR2_PACKAGE_LTTNG_LIBUST in the menu Target packages | Libraries | Other, enable lttng-libust
There is a package called lttng-babeltrace for the target. Buildroot builds the host babeltrace automatically and places it in output/host/usr/bin/babeltrace.
Using LTTng for kernel tracing
LTTng can use the set of ftrace events described previously as potential trace points. Initially, they are disabled.
The control interface for LTTng is the lttng command. You can list the kernel probes using the following:
# lttng list --kernel
Kernel events:
-------------
writeback_nothread (loglevel: TRACE_EMERG (0)) (type: tracepoint)
writeback_queue (loglevel: TRACE_EMERG (0)) (type: tracepoint)
writeback_exec (loglevel: TRACE_EMERG (0)) (type: tracepoint)
[...]
Traces are captured in the context of a session, which in this example is called test:
# lttng create test
Session test created.
Traces will be written in /home/root/lttng-traces/test-20150824-140942
# lttng list
Available tracing sessions:
1) test (/home/root/lttng-traces/test-20150824-140942) [inactive]
Now enable a few events in the current session. You can enable all kernel tracepoints using the --all option, but remember the warning about generating too much trace data. Let's start with a couple of scheduler-related trace events:
# lttng enable-event --kernel sched_switch,sched_process_fork
Check that everything is set up:
# lttng list test
Tracing session test: [inactive]
    Trace path: /home/root/lttng-traces/test-20150824-140942
    Live timer interval (usec): 0
=== Domain: Kernel ===
Channels:
-------------
- channel0: [enabled]
    Attributes:
      overwrite mode: 0
      subbufers size: 26214
      number of subbufers: 4
      switch timer interval: 0
      read timer interval: 200000
      trace file count: 0
      trace file size (bytes): 0
      output: splice()
    Events:
      sched_process_fork (loglevel: TRACE_EMERG (0)) (type: tracepoint) [enabled]
      sched_switch (loglevel: TRACE_EMERG (0)) (type: tracepoint) [enabled]
Now start tracing:
# lttng start
Run the test load and then stop tracing:
# lttng stop
Traces for the session are written to the session directory, lttng-traces/<session>/kernel.
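The whole sequence can be wrapped in a small script so that a test load is always bracketed by start and stop. This is only a sketch: lttng_capture is a hypothetical helper name, and the final lttng destroy step is my addition rather than part of the walkthrough above. It tears down the session but leaves the trace files on disk:

```shell
#!/bin/sh
# Sketch only: lttng_capture is an illustrative helper, not an LTTng command.
# Usage: lttng_capture <session> <event[,event...]> <command> [args...]
lttng_capture() {
    session="$1"
    events="$2"
    shift 2
    # Create the session, enable the kernel events, trace, then run the load.
    lttng create "$session" &&
        lttng enable-event --kernel "$events" &&
        lttng start &&
        "$@"
    status=$?
    lttng stop                   # stop tracing even if the load failed
    lttng destroy "$session"     # assumption: the trace files are kept
    return "$status"
}
```

For instance, lttng_capture test sched_switch,sched_process_fork ./my-test-load would capture the two scheduler events while a hypothetical program named my-test-load runs.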
You can use the Babeltrace viewer to dump the raw trace data in text format. In this case, I ran it on the host computer:
$ babeltrace lttng-traces/test-20150824-140942/kernel
The output is too verbose to fit on this page, so I will leave it as an exercise for you to capture and display a trace in this way. The text output from Babeltrace does have the advantage that it is easy to search for strings using grep and similar commands.
A good choice for a graphical trace viewer is the Trace Compass plugin for Eclipse, which is now part of the Eclipse IDE for C/C++ developer bundle. Importing the trace data into Eclipse is characteristically fiddly. Briefly, you need to follow these steps:
1. Open the tracing perspective.
2. Create a new project by selecting File | New | Tracing project.
3. Enter a project name and click on Finish.
4. Right-click on the New Project option in the Project Explorer menu and select Import.
5. Expand Tracing and then select Trace Import.
6. Browse to the directory containing the traces (for example, test-20150824-140942), tick the box to indicate which subdirectories you want (it might be the kernel), and click on Finish.
7. Now, expand the project and, within that, expand Traces[1] and, within that, double-click on kernel. You should see the trace data shown in the following screenshot:
In the preceding screenshot, I have zoomed in on the Control Flow view to show state transitions between dropbear and a shell, and also some activity of the lttng daemon.
Using Valgrind
I introduced Valgrind in Chapter 13, Managing Memory, as a tool for identifying memory problems using the memcheck tool. Valgrind has other useful tools for application profiling. The two I am going to look at here are Callgrind and Helgrind. Since Valgrind works by running the code in a sandbox, it is able to check the code as it runs and report certain behaviors, which native tracers and profilers cannot do.
Callgrind
Callgrind is a call-graph-generating profiler that also collects information about processor cache hit rate and branch prediction. Callgrind is only useful if your bottleneck is CPU bound. It's not useful if heavy I/O or multiple processes are involved.
Valgrind does not require kernel configuration, but it does need debug symbols. It is available as a target package in both the Yocto Project and Buildroot (BR2_PACKAGE_VALGRIND).
You run Callgrind in Valgrind on the target, like so:
# valgrind --tool=callgrind <program>
This produces a file called callgrind.out.<PID>, which you can copy to the host and analyze with callgrind_annotate.
The default is to capture data for all the threads together in a single file. If you add option --separate-threads=yes when capturing, there will be profiles for each of the threads in files named callgrind.out.<PID>-<thread id>, for example, callgrind.out.122-01 and callgrind.out.122-02.
Callgrind can simulate the processor L1/L2 cache and report on cache misses. Capture the trace with the --simulate-cache=yes option. L2 misses are much more expensive than L1 misses, so pay attention to code with high D2mr or D2mw counts.
Helgrind
This is a thread-error detector for detecting synchronization errors in C, C++, and Fortran programs that include POSIX threads.
Helgrind can detect three classes of error. Firstly, it can detect the incorrect use of the API, for example, unlocking a mutex that is already unlocked, unlocking a mutex that was locked by a different thread, or not checking the return value of certain pthread functions. Secondly, it monitors the order in which threads acquire locks and thus detects potential deadlocks that could arise from the formation of cycles of locks, which is also known as the deadly embrace. Finally, it detects data races, which can happen when two threads access a shared memory location without using suitable locks or other synchronization to ensure single-threaded access.
Using Helgrind is simple; you just need this command:
# valgrind --tool=helgrind <program>
It prints problems and potential problems as it finds them. You can direct these messages to a file by adding --log-file=<filename>.
Using strace
I started the chapter with the simple and ubiquitous tool, top, and I will finish with another: strace. It is a very simple tracer that captures system calls made by a program and, optionally, its children. You can use it to do the following:
Learn which system calls a program makes
Find those system calls that fail, together with the error code: I find this useful if a program fails to start but doesn't print an error message or if the message is too general
Find which files a program opens
Find out which syscalls a running program is making, for example, to see whether it is stuck in a loop
There are many more examples online; just search for strace tips and tricks. Everybody has their own favorite story, for example, http://chadfowler.com/2014/01/26/the-magic-of-strace.html.
strace uses the ptrace(2) function to hook calls as they are made from user space to the kernel. If you want to know more about how ptrace works, the manual page is detailed and surprisingly readable.
The simplest way to get a trace is to run the command as a parameter to strace as shown here (the listing has been edited to make it clearer):
# strace ./helloworld
execve("./helloworld", ["./helloworld"], [/* 14 vars */]) = 0
brk(0) = 0x11000
uname({sys="Linux", node="beaglebone", ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f40000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=8100, ...}) = 0
mmap2(NULL, 8100, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6f3e000
close(3) = 0
open("/lib/tls/v7l/neon/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[...]
open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0$`\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1291884, ...}) = 0
mmap2(NULL, 1328520, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6df9000
mprotect(0xb6f30000, 32768, PROT_NONE) = 0
mmap2(0xb6f38000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x137000) = 0xb6f38000
mmap2(0xb6f3b000, 9608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb6f3b000
close(3)
[...]
write(1, "Hello, world!\n", 14Hello, world!
) = 14
exit_group(0) = ?
+++ exited with 0 +++
Most of the trace shows how the runtime environment is created. In particular, you can see how the library loader hunts for libc.so.6, eventually finding it in /lib. Finally, it gets to running the main() function of the program, which prints its message and exits.
If you want strace to follow any child processes or threads created by the original process, add the -f option.
If you are using strace to trace a program that creates threads, you almost certainly want the -f option. Better still, use -ff and -o <filename> so that the output for each child process or thread is written to a separate file named <filename>.<PID|TID>.
A common use of strace is to discover which files a program tries to open at startup. You can restrict the system calls that are traced through the -e option, and you can write the trace to a file instead of stdout using the -o option:
# strace -e open -o ssh-strace.txt ssh localhost
This shows the libraries and configuration files ssh opens when it is setting up a connection.
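Once the trace is in a file, ordinary text tools can pick out the interesting lines. As a small sketch, the function below (failed_opens is a name I have made up) lists the paths of open calls that returned an error. It assumes a log written with -o and without the PID prefix that -f adds to each line:

```shell
#!/bin/sh
# Sketch only: failed_opens is an illustrative helper name.
# Prints the path argument of every open()/openat() call that failed.
failed_opens() {
    # Splitting on double quotes makes field 2 the quoted path.
    awk -F'"' '/^open(at)?\(/ && / = -1 E/ { print $2 }' "$@"
}
```

Running failed_opens ssh-strace.txt would show the files that ssh looked for but could not open.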
You can even use strace as a basic profile tool: if you use the -c option, it accumulates the time spent in system calls and prints out a summary like this:
# strace -c grep linux /usr/lib/* > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.68    0.012825           1     11098        18 read
 11.03    0.001798           1      3551           write
 10.02    0.001634           8       216        15 open
  0.26    0.000043           0       202           fstat64
  0.00    0.000000           0       201           close
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         3           brk
  0.00    0.000000           0       199           munmap
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         5           mprotect
  0.00    0.000000           0       207           mmap2
  0.00    0.000000           0        15        15 stat64
  0.00    0.000000           0         1           getuid32
  0.00    0.000000           0         1           set_tls
------ ----------- ----------- --------- --------- ----------------
100.00    0.016300                 15702        49 total
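The % time column is simply each syscall's share of the total seconds. As a quick check of the arithmetic, here is a one-line helper (pct_time is a hypothetical name) that reproduces the figure for read from the table above:

```shell
#!/bin/sh
# Sketch only: pct_time is an illustrative helper name.
# Prints part/total as a percentage with two decimal places.
pct_time() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * part / total }'
}

pct_time 0.012825 0.016300    # seconds in read / total seconds
```

This prints 78.68, which matches the first row.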
Summary
Nobody can complain that Linux lacks options for profiling and tracing. This chapter has given you an overview of some of the most common ones.
When faced with a system that is not performing as well as you would like, start with top and try to identify the problem. If it proves to be a single application, then you can use perf record/report to profile it, bearing in mind that you will have to configure the kernel to enable perf and you will need debug symbols for the binaries and kernel. OProfile is an alternative to perf record and can tell you similar things. gprof is, frankly, outdated, but it does have the advantage of not requiring kernel support. If the problem is not so well localized, use perf (or OProfile) to get a system-wide view.
Ftrace comes into its own when you have specific questions about the behavior of the kernel. The function and function_graph tracers provide a detailed view of the relationship and sequence of function calls. The event tracers allow you to extract more information about functions, including the parameters and return values. LTTng performs a similar role, making use of the event trace mechanism, and adds high-speed ring buffers to extract large quantities of data from the kernel. Valgrind has the particular advantage that it runs code in a sandbox and can report on errors that are hard to track down in other ways. Using the Callgrind tool, it can generate call graphs and report on processor cache usage, and with Helgrind, it can report on thread-related problems.
Finally, don't forget strace. It is a good standby for finding out which system calls a program is making, from tracking file open calls to finding file path names and checking for system wake-ups and incoming signals.
All the while, be aware of, and try to avoid, the observer effect: make sure that the measurements you are making are valid for a production system. In the next chapter, I will continue the theme as I delve into the latency tracers that help us quantify the real-time performance of a target system.
Real-Time Programming
Much of the interaction between a computer system and the real world happens in real time, and so this is an important topic for developers of embedded systems. I have touched on real-time programming in several places so far: in Chapter 12, Learning About Processes and Threads, I looked at scheduling policies and priority inversion, and in Chapter 13, Managing Memory, I described the problems with page faults and the need for memory locking. Now, it is time to bring these topics together and look at real-time programming in some depth.
In this chapter, I will begin with a discussion about the characteristics of real-time systems and then consider the implications for system design, both at the application and kernel levels. I will describe the real-time kernel patch, PREEMPT_RT, and show how to get it and apply it to a mainline kernel. The final sections will describe how to characterize system latencies using two tools: cyclictest and Ftrace.
There are other ways to achieve real-time behavior on an embedded Linux device, for instance, using a dedicated microcontroller or a separate real-time kernel alongside the Linux kernel in the way that Xenomai and RTAI do. I am not going to discuss these here because the focus of this book is on using Linux as the core for embedded systems.
In this chapter, we will cover the following topics:
What is real time?
Identifying sources of non-determinism
Understanding scheduling latency
Kernel preemption
The real-time Linux kernel (PREEMPT_RT)
High-resolution timers
Avoiding page faults
Interrupt shielding
Measuring scheduling latencies
What is real time?
The nature of real-time programming is one of the subjects that software engineers love to discuss at length, often giving a range of contradictory definitions. I will begin by setting out what I think is important about real time.
A task is a real-time task if it has to complete before a certain point in time, known as the deadline. The distinction between real-time and non-real-time tasks is shown by considering what happens when you play an audio stream on your computer while compiling the Linux kernel. The first is a real-time task because there is a constant stream of data arriving at the audio driver, and blocks of audio samples have to be written to the audio interface at the playback rate. Meanwhile, the compilation is not real time because there is no deadline. You simply want it to complete as soon as possible; whether it takes 10 seconds or 10 minutes does not affect the quality of the kernel binaries.
The other important thing to consider is the consequence of missing the deadline, which can range from mild annoyance through to system failure or, in the most extreme cases, injury or death. Here are some examples:
Playing an audio stream: There is a deadline in the order of tens of milliseconds. If the audio buffer underruns, you will hear a click, which is annoying, but you will get over it.
Moving and clicking a mouse: The deadline is also in the order of tens of milliseconds. If it is missed, the mouse moves erratically and button clicks will be lost. If the problem persists, the system will become unusable.
Printing a piece of paper: The deadlines for the paper feed are in the millisecond range, which if missed may cause the printer to jam, and somebody will have to go and fix it. Occasional jams are acceptable, but nobody is going to buy a printer that keeps on jamming.
Printing sell-by dates on bottles on a production line: If one bottle is not printed, the whole production line has to be halted, the bottle removed, and the line restarted, which is expensive.
Baking a cake: There is a deadline of 30 minutes or so. If you miss it by a few minutes, the cake might be ruined. If you miss it by a lot, the house may burn down.
A power-surge detection system: If the system detects a surge, a circuit breaker has to be triggered within 2 milliseconds. Failing to do so causes damage to the equipment and may injure or kill personnel.
In other words, there are many consequences to missed deadlines. We often talk about these different categories:
Soft real-time: The deadline is desirable but is sometimes missed without the system being considered a failure. The first two examples in the previous list are like this.
Hard real-time: Here, missing a deadline has a serious effect. We can further subdivide hard real-time into mission-critical systems, in which there is a cost to missing the deadline, such as the fourth example; and safety-critical systems, in which there is a danger to life and limb, such as the last two examples. I put in the baking example to show that not all hard real-time systems have deadlines measured in milliseconds or microseconds.
Software written for safety-critical systems has to conform to various standards that seek to ensure that it is capable of performing reliably. It is very difficult for a complex operating system such as Linux to meet those requirements.
When it comes to mission-critical systems, it is possible, and common, for Linux to be used for a wide range of control systems. The requirements of the software depend on the combination of the deadline and the confidence level, which can usually be determined through extensive testing.
Therefore, to say that a system is real-time, you have to measure its response times under the maximum anticipated load, and show that it meets the deadline for an agreed proportion of the time. As a rule of thumb, a well-configured Linux system using a mainline kernel is good for soft real-time tasks with deadlines down to tens of milliseconds, and a kernel with the PREEMPT_RT patch is good for soft and hard real-time mission-critical systems with deadlines down to several hundreds of microseconds.
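To make "an agreed proportion of the time" concrete, here is a minimal sketch that takes measured response times, one per line in microseconds, and reports the percentage that met a given deadline. The function name deadline_hit_rate and the input format are assumptions made for this example; they are not the output format of any particular tool:

```shell
#!/bin/sh
# Sketch only: deadline_hit_rate and its input format are illustrative.
# Reads one latency sample (microseconds) per line on stdin and prints
# the percentage of samples at or below the deadline.
deadline_hit_rate() {
    awk -v deadline="$1" '
        { total++; if ($1 <= deadline) met++ }
        END { if (total) printf "%.2f\n", 100 * met / total }'
}

# Four samples against a 200 microsecond deadline: three of them meet it.
printf '50\n120\n250\n90\n' | deadline_hit_rate 200
```

This prints 75.00. Whether that proportion is acceptable depends on the consequences of a miss, as described in the categories above.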
The key to creating a real-time system is to reduce the variability in response times so that you have greater confidence that the deadlines will not be missed; in other words, you need to make the system more deterministic. Often, this is done at the expense of performance. For example, caches make systems run faster by making the average time to access an item of data shorter, but the maximum time is longer in the case of a cache miss. Caches make a system faster but less deterministic, which is the opposite of what we want.
It is a myth of real-time computing that it is fast. This is not so: the more deterministic a system is, the lower the maximum throughput.
The remainder of this chapter is concerned with identifying the causes of latency and the things you can do to reduce it.
Identifying sources of non-determinism
Fundamentally, real-time programming is about making sure that the threads controlling the output in real time are scheduled when needed and so can complete the job before the deadline. Anything that prevents this is a problem. Here are some problem areas:
Scheduling: Real-time threads must be scheduled before others, and so they must have a real-time policy, SCHED_FIFO or SCHED_RR. Additionally, they should have priorities assigned in descending order starting with the one with the shortest deadline, according to the theory of Rate Monotonic Analysis that I described in Chapter 12, Learning About Processes and Threads.
Scheduling latency: The kernel must be able to reschedule as soon as an event such as an interrupt or timer occurs, and not be subject to unbounded delays. Reducing scheduling latency is a key topic later on in this chapter.
Priority inversion: This is a consequence of priority-based scheduling, which leads to unbounded delays when a high-priority thread is blocked on a mutex held by a low-priority thread, as I described in Chapter 12, Learning About Processes and Threads. User space has priority inheritance and priority ceiling mutexes; in kernel space, we have rt-mutexes, which implement priority inheritance and which I will talk about in the section on the real-time kernel.
Accurate timers: If you want to manage deadlines in the region of low milliseconds or microseconds, you need timers that match. High-resolution timers are crucial and are a configuration option on almost all kernels.
Page faults: A page fault while executing a critical section of code will upset all timing estimates. You can avoid them by locking memory, as I shall describe later on.
Interrupts: They occur at unpredictable times and can result in an unexpected processing overhead if there is a sudden flood of them. There are two ways to avoid this. One is to run interrupts as kernel threads, and the other, on multi-core devices, is to shield one or more CPUs from interrupt handling. I will discuss both possibilities later.
Processor caches: These provide a buffer between the CPU and the main memory and, like all caches, are a source of non-determinism, especially on multi-core devices. Unfortunately, this is beyond the scope of this book, but you may want to refer to the references at the end of the chapter for more details.
Memory bus contention: When peripherals access memory directly through a DMA channel, they use up a slice of memory bus bandwidth, which slows down access from the CPU core (or cores) and so contributes to non-deterministic execution of the program. However, this is a hardware issue and is also beyond the scope of this book.
I will expand on the important problems and see what can be done about them in the next sections.
Understanding scheduling latency
Real-time threads need to be scheduled as soon as they have something to do. However, even if there are no other threads of the same or higher priority, there is always a delay from the point at which the wake-up event occurs (an interrupt or system timer) to the time that the thread starts to run. This is called the scheduling latency. It can be broken down into several components, as shown in the following diagram:
Firstly, there is the hardware interrupt latency from the point at which an interrupt is asserted until the interrupt service routine (ISR) begins to run. A small part of this is the delay in the interrupt hardware itself, but the biggest problem is due to interrupts being disabled in software. Minimizing this IRQ off time is important.
The next is interrupt latency, which is the length of time until the ISR has serviced the interrupt and woken up any threads waiting on this event. It is mostly dependent on the way the ISR was written. Normally, it should take only a short time, measured in microseconds.
The final delay is the preemption latency, which is the time from the point that the kernel is notified that a thread is ready to run to that at which the scheduler actually runs the thread. It is determined by whether the kernel can be preempted or not. If it is running code in a critical section, then the rescheduling will have to wait. The length of the delay is dependent on the configuration of kernel preemption.
Kernel preemption
Preemption latency occurs because it is not always safe or desirable to preempt the current thread of execution and call the scheduler. Mainline Linux has three settings for preemption, selected via the Kernel Features | Preemption Model menu:
CONFIG_PREEMPT_NONE: No preemption
CONFIG_PREEMPT_VOLUNTARY: This enables additional checks for requests for preemption
CONFIG_PREEMPT: This allows the kernel to be preempted
With preemption set to none, kernel code will continue without rescheduling until it either returns via a syscall back to user space, where preemption is always allowed, or it encounters a sleeping wait that stops the current thread. Since it reduces the number of transitions between the kernel and user space and may reduce the total number of context switches, this option results in the highest throughput at the expense of large preemption latencies. It is the default for servers and some desktop kernels, where throughput is more important than responsiveness.
The second option enables explicit preemption points, where the scheduler is called if the need_resched flag is set, which reduces the worst-case preemption latencies at the expense of slightly lower throughput. Some distributions set this option on desktops.
The third option makes the kernel preemptible, meaning that an interrupt can result in an immediate reschedule so long as the kernel is not executing in an atomic context, which I will describe in the following section. This reduces worst-case preemption latencies and, therefore, overall scheduling latencies to something in the order of a few milliseconds on typical embedded hardware.
This is often described as a soft real-time option, and most embedded kernels are configured in this way. Of course, there is a small reduction in overall throughput, but that is usually less important than having more deterministic scheduling for embedded devices.
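If you are not sure which preemption model a kernel was built with, you can search its configuration. The function below is a sketch with a made-up name, preempt_model. Its default argument assumes that the kernel exposes /proc/config.gz, which requires CONFIG_IKCONFIG_PROC; otherwise, pass the path to the .config file from the kernel build directory:

```shell
#!/bin/sh
# Sketch only: preempt_model is an illustrative helper name.
# Prints the preemption model option(s) set to y in a kernel config.
preempt_model() {
    config="${1:-/proc/config.gz}"
    case "$config" in
        *.gz) zcat "$config" ;;   # compressed, as exposed under /proc
        *)    cat "$config" ;;    # plain .config from the build tree
    esac | grep -E '^CONFIG_PREEMPT(_NONE|_VOLUNTARY|_RT_FULL)?=y$'
}
```

The pattern deliberately ignores related options such as CONFIG_PREEMPT_COUNT. A kernel with the real-time patches applied may report more than one line.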
The real-time Linux kernel (PREEMPT_RT)
There is a long-standing effort to reduce latencies still further that goes by the name of the kernel configuration option for these features, PREEMPT_RT. The project was started by Ingo Molnar, Thomas Gleixner, and Steven Rostedt and has had contributions from many more developers over the years. The kernel patches are at https://www.kernel.org/pub/linux/kernel/projects/rt, and there is a wiki, including an FAQ (slightly out of date), at https://rt.wiki.kernel.org.
Many parts of the project have been incorporated into mainline Linux over the years, including high-resolution timers, kernel mutexes, and threaded interrupt handlers. However, the core patches remain outside of the mainline because they are rather intrusive and (some claim) only benefit a small percentage of the total Linux user base. Maybe, one day, the whole patch set will be merged upstream.
The central plan is to reduce the amount of time the kernel spends running in an atomic context, which is where it is not safe to call the scheduler and switch to a different thread. Typical atomic contexts are when the kernel is in the following states:
Running an interrupt or trap handler.
Holding a spin lock or in an RCU critical section. Spin lock and RCU are kernel-locking primitives, the details of which are not relevant here.
Between calls to preempt_disable() and preempt_enable().
Hardware interrupts are disabled (IRQs off).
The changes that are part of PREEMPT_RT fall into two main areas: one is to reduce the impact of interrupt handlers by turning them into kernel threads, and the other is to make locks preemptible so that a thread can sleep while holding one. It is obvious that there is a large overhead in these changes, which makes average-case interrupt handling slower but much more deterministic, which is what we are striving for.
Threaded interrupt handlers
Not all interrupts are triggers for real-time tasks, but all interrupts steal cycles from real-time tasks. Threaded interrupt handlers allow a priority to be associated with the interrupt and for it to be scheduled at an appropriate time, as shown in the following diagram:
If the interrupt-handler code is run as a kernel thread, there is no reason why it cannot be preempted by a user space thread of higher priority, and so the interrupt handler does not contribute toward scheduling latency of the user space thread. Threaded interrupt handlers have been a feature of mainline Linux since 2.6.30. You can request that an individual interrupt handler be threaded by registering it with request_threaded_irq() in place of the normal request_irq(). You can make threaded IRQs the default by configuring the kernel with CONFIG_IRQ_FORCED_THREADING=y, which makes all handlers into threads unless they have explicitly prevented this by setting the IRQF_NO_THREAD flag. When you apply the PREEMPT_RT patches, interrupts are, by default, configured as threads in this way. Here is an example of what you might see:
# ps -Leo pid,tid,class,rtprio,stat,comm,wchan | grep FF
  PID   TID CLS RTPRIO STAT COMMAND         WCHAN
    3     3 FF       1 S    ksoftirqd/0     smpboot_th
    7     7 FF      99 S    posixcputmr/0   posix_cpu_
   19    19 FF      50 S    irq/28-edma     irq_thread
   20    20 FF      50 S    irq/30-edma_err irq_thread
   42    42 FF      50 S    irq/91-rtc0     irq_thread
   43    43 FF      50 S    irq/92-rtc0     irq_thread
   44    44 FF      50 S    irq/80-mmc0     irq_thread
   45    45 FF      50 S    irq/150-mmc0    irq_thread
   47    47 FF      50 S    irq/44-mmc1     irq_thread
   52    52 FF      50 S    irq/86-44e0b000 irq_thread
   59    59 FF      50 S    irq/52-tilcdc   irq_thread
   65    65 FF      50 S    irq/56-4a100000 irq_thread
   66    66 FF      50 S    irq/57-4a100000 irq_thread
   67    67 FF      50 S    irq/58-4a100000 irq_thread
   68    68 FF      50 S    irq/59-4a100000 irq_thread
   76    76 FF      50 S    irq/88-OMAPUAR  irq_thread
In this case, a BeagleBone running linux-yocto-rt, only the gp_timer interrupt was not threaded. It is normal that the timer interrupt handler be run inline.
The interrupt threads have all been given the default policy SCHED_FIFO and a priority of 50. It doesn't make sense to leave them with the defaults, however; now is your chance to assign priorities according to the importance of the interrupts compared to real-time user space threads.
Here is a suggested order of descending thread priorities:
The POSIX timers thread, posixcputmr, should always have the highest priority.
Hardware interrupts associated with the highest priority real-time thread.
The highest priority real-time thread.
Hardware interrupts for the progressively lower-priority real-time threads, followed by the thread itself.
The next highest-priority real-time thread.
Hardware interrupts for non-real-time interfaces.
The soft IRQ daemon, ksoftirqd, which on RT kernels is responsible for running delayed interrupt routines and, prior to Linux 3.6, was responsible for running the network stack, the block I/O layer, and other things. You may need to experiment with different priority levels to achieve a balance.
You can change the priorities using the chrt command as part of the boot script, using a command like this:
# chrt -f -p 90 `pgrep irq/28-edma`
The pgrep command is part of the procps package.
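In a boot script, you will probably want to set several priorities in one place. As a sketch, the helper below (set_irq_thread_prio is a name invented for this example) wraps the same chrt and pgrep commands and copes with a pattern that matches more than one thread. The priority values in the usage lines are placeholders, not recommendations:

```shell
#!/bin/sh
# Sketch only: set_irq_thread_prio is an illustrative helper name.
# Gives SCHED_FIFO priority <prio> to every thread matching <name>.
set_irq_thread_prio() {
    name="$1"
    prio="$2"
    for pid in $(pgrep "$name"); do
        chrt -f -p "$prio" "$pid"     # -f selects SCHED_FIFO, -p an existing PID
    done
}
```

A boot script might then contain lines such as set_irq_thread_prio irq/28-edma 90 and set_irq_thread_prio irq/80-mmc0 40. Bear in mind that pgrep matches its pattern as a regular expression, so choose names that are specific enough.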
Preemptible kernel locks
Making the majority of kernel locks preemptible is the most intrusive change that PREEMPT_RT makes, and this code remains outside of the mainline kernel.
The problem occurs with spin locks, which are used for much of the kernel locking. A spin lock is a busy-wait mutex that does not require a context switch in the contended case, and so it is very efficient as long as the lock is held for a short time. Ideally, they should be locked for less than the time it would take to reschedule twice. The following diagram shows threads running on two different CPUs contending the same spin lock. CPU 0 gets it first, forcing CPU 1 to spin, waiting until it is unlocked:
The thread that holds the spin lock cannot be preempted since doing so may make the new thread enter the same code and deadlock when it tries to lock the same spin lock. Consequently, in mainline Linux, locking a spin lock disables kernel preemption, creating an atomic context. This means that a low priority thread that holds a spin lock can prevent a high-priority thread from being scheduled.
The solution adopted by PREEMPT_RT is to replace almost all spin locks with RT-mutexes. A mutex is slower than a spin lock, but it is fully preemptible. Not only that, but RT-mutexes implement priority inheritance and so are not susceptible to priority inversion.
Getting the PREEMPT_RT patches
The RT developers do not create patch sets for every kernel version because of the amount of effort involved. On average, they create patches for every other kernel. The most recent kernels that are supported at the time of writing are as follows:
4.9-rt
4.8-rt
4.6-rt
4.4-rt
4.1-rt
4.0-rt
3.18-rt
3.14-rt
3.12-rt
3.10-rt
3.4-rt
3.2-rt
The patches are available at https://www.kernel.org/pub/linux/kernel/projects/rt.
If you are using the Yocto Project, there is an rt version of the kernel already. Otherwise, it is possible that the place you got your kernel from already has the PREEMPT_RT patch applied. If not, you will have to apply the patch yourself. Firstly, make sure that the PREEMPT_RT patch version and your kernel version match exactly; otherwise, you will not be able to apply the patches cleanly. Then, you apply it in the normal way, as shown in the following command lines. You will then be able to configure the kernel with CONFIG_PREEMPT_RT_FULL.
$ cd linux-4.1.10
$ zcat patch-4.1.10-rt11.patch.gz | patch -p1
There is a problem in the previous paragraph. The RT patch will only apply if you are using a compatible mainline kernel. You are probably not, because that is the nature of embedded Linux kernels. So you will have to spend some time looking at failed patches and fixing them, and then analyzing the board support for your target and adding any real-time support that is missing. These details are, once again, outside the scope of this book. If you are not sure what to do, you should inquire of the developers of the kernel you are using and on kernel-developer forums.
The Yocto Project and PREEMPT_RT
The Yocto Project supplies two standard kernel recipes: linux-yocto and linux-yocto-rt, the latter having the real-time patches already applied. Assuming that your target is supported by the Yocto kernels, you just need to select linux-yocto-rt as your preferred kernel and declare that your machine is compatible, for example, by adding lines similar to these to your conf/local.conf:
PREFERRED_PROVIDER_virtual/kernel = "linux-yocto-rt"
COMPATIBLE_MACHINE_beaglebone = "beaglebone"
High-resolution timers
Timer resolution is important if you have precise timing requirements, which is typical for real-time applications. The default timer in Linux is a clock that runs at a configurable rate, typically 100 Hz for embedded systems and 250 Hz for servers and desktops. The interval between two timer ticks is known as a jiffy and, in the examples given previously, is 10 milliseconds on an embedded SoC and four milliseconds on a server.
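The arithmetic behind those figures can be sketched in a few lines of C. The function names here are hypothetical, and the rounding model is a simplification: it assumes that, without high-resolution timers, a timeout can only expire on a tick, so a request is effectively rounded up to a whole number of jiffies.

```c
/* Length of one jiffy in microseconds for a tick rate of hz (CONFIG_HZ). */
long jiffy_us(long hz)
{
    return 1000000L / hz;
}

/* Simplified model: round a requested timeout up to the next whole jiffy,
 * which is the best a tick-driven timer can do. */
long round_up_to_jiffy_us(long request_us, long hz)
{
    long j = jiffy_us(hz);

    return ((request_us + j - 1) / j) * j;
}
```

With hz set to 100, jiffy_us() gives 10,000 microseconds, and a request for a 1,500 microsecond sleep is rounded up to a full 10 milliseconds, which is why tick-based timers are too coarse for most real-time control loops.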
Linux gained more accurate timers from the real-time kernel project in version 2.6.18, and now they are available on all platforms, provided that there is a high-resolution timer source and device driver for it, which is almost always the case. You need to configure the kernel with CONFIG_HIGH_RES_TIMERS=y.
With this enabled, all the kernel and user space clocks will be accurate down to the granularity of the underlying hardware. Finding the actual clock granularity is difficult. The obvious answer is the value provided by clock_getres(2), but that always claims a resolution of one nanosecond. The cyclictest tool that I will describe later has an option to analyze the times reported by the clock to guess the resolution:
# cyclictest -R
# /dev/cpu_dma_latency set to 0us
WARN: reported clock resolution: 1 nsec
WARN: measured clock resolution approximately: 708 nsec
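To see the difference between the reported and the observed resolution from your own code, you can compare clock_getres(2) with the smallest step seen between back-to-back clock readings. This is only a sketch of the idea and is not claimed to be the algorithm that cyclictest uses; the function names are hypothetical, and CLOCK_MONOTONIC is an assumed choice of clock.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_getres() and clock_gettime() */
#include <time.h>

/* Resolution that the kernel claims for CLOCK_MONOTONIC, in nanoseconds,
 * or -1 on error. */
long reported_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;
    return res.tv_sec * 1000000000L + res.tv_nsec;
}

/* Smallest non-zero step seen between consecutive readings of
 * CLOCK_MONOTONIC, in nanoseconds, or -1 if the clock never advanced.
 * This is an upper bound on the granularity because it also includes
 * the cost of the clock_gettime() call itself. */
long measured_resolution_ns(int samples)
{
    struct timespec prev, now;
    long smallest = -1;

    clock_gettime(CLOCK_MONOTONIC, &prev);
    for (int i = 0; i < samples; i++) {
        clock_gettime(CLOCK_MONOTONIC, &now);
        long delta = (now.tv_sec - prev.tv_sec) * 1000000000L
                     + (now.tv_nsec - prev.tv_nsec);
        if (delta > 0 && (smallest < 0 || delta < smallest))
            smallest = delta;
        prev = now;
    }
    return smallest;
}
```

On a system with high-resolution timers, you would expect the first function to return 1 and the second to return a value that reflects the real hardware, typically somewhere below a microsecond.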
You can also look at the kernel log messages for strings like this:
# dmesg | grep clock
OMAP clockevent source: timer2 at 24000000 Hz
sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956969942ns
OMAP clocksource: timer1 at 24000000 Hz
Switched to clocksource timer1
The two methods provide rather different numbers, for which I have no good explanation, but since both are below one microsecond, I am happy.
Avoiding page faults
A page fault occurs when an application reads or writes to memory that is not committed to physical memory. It is impossible (or very hard) to predict when a page fault will happen, so they are another source of non-determinism in computers.
Fortunately, there is a function that allows you to commit all the memory used by the process and lock it down so that it cannot cause a page fault. It is mlockall(2). These are its two flags:
MCL_CURRENT: This locks all pages currently mapped
MCL_FUTURE: This locks pages that are mapped in later
You usually call mlockall during the startup of the application with both flags set to lock all current and future memory mappings.
MCL_FUTURE is not magic, in that there will still be non-deterministic delay when allocating or freeing heap memory using malloc()/free() or mmap(). Such operations are best done at startup and not in the main control loops.
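One way to follow that advice for heap memory is to allocate the buffers you need at startup and touch every page before the control loop begins. The following is a minimal sketch under that assumption; prefault_alloc is a hypothetical helper, not a library function.

```c
#include <stdlib.h>
#include <unistd.h>

/* Allocate size bytes and write one byte to each page so that every page is
 * backed by physical memory before the real-time loop starts.
 * Returns NULL on failure. */
void *prefault_alloc(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    char *buf;

    if (page <= 0)
        return NULL;
    buf = malloc(size);
    if (buf == NULL)
        return NULL;
    /* The volatile access stops the compiler from dropping the writes */
    for (size_t off = 0; off < size; off += (size_t)page)
        ((volatile char *)buf)[off] = 0;
    return buf;
}
```

Called once at startup, together with mlockall(MCL_CURRENT | MCL_FUTURE), this keeps both the allocation and the first-touch page faults out of the time-critical path.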
Memory allocated on the stack is trickier because it is done automatically, and if you call a function that makes the stack deeper than before, you will encounter more memory-management delays. A simple fix is to grow the stack to a size larger than you think you will ever need at startup. The code would look like this:
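#include <string.h>
#include <sys/mman.h>
#define MAX_STACK (512*1024)
/* Touch MAX_STACK bytes of stack so that those pages are committed now */
static void stack_grow(void)
{
    char dummy[MAX_STACK];
    memset(dummy, 0, MAX_STACK);
    return;
}
int main(int argc, char *argv[])
{
    [...]
    stack_grow();
    /* Lock all current and future mappings into RAM */
    mlockall(MCL_CURRENT | MCL_FUTURE);
    [...]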
The stack_grow() function allocates a large variable on the stack and then zeroes it to force those pages of memory to be committed to this process.
Interrupt shielding
Using threaded interrupt handlers helps mitigate interrupt overhead by running some threads at a higher priority than interrupt handlers that do not impact real-time tasks. If you are using a multi-core processor, you can take a different approach and shield one or more cores from processing interrupts completely, allowing them to be dedicated to real-time tasks instead. This works either with a normal Linux kernel or a PREEMPT_RT kernel.
Achieving this is a question of pinning the real-time threads to one CPU and the interrupt handlers to a different one. You can set the CPU affinity of a thread or process using the command-line tool taskset, or you can use the sched_setaffinity(2) and pthread_setaffinity_np(3) functions.
To set the affinity of an interrupt, first note that there is a subdirectory for each interrupt number in /proc/irq/<IRQ number>. The control files for the interrupt are in there, including a CPU mask in smp_affinity. Write a bitmask to that file with a bit set for each CPU that is allowed to handle that IRQ.
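As an illustration of the bitmask, here is a sketch that builds a mask from a list of CPU numbers and writes it, in hexadecimal, to the smp_affinity file of one interrupt. The function names are hypothetical, and the code assumes that the CPU numbers are smaller than the number of bits in an unsigned long and that the program runs as root on the target.

```c
#include <stdio.h>

/* Build a CPU bitmask with one bit set for each CPU number in cpus[]. */
unsigned long cpu_mask(const int *cpus, int count)
{
    unsigned long mask = 0;

    for (int i = 0; i < count; i++)
        mask |= 1UL << cpus[i];
    return mask;
}

/* Write the mask in hexadecimal to /proc/irq/<irq>/smp_affinity.
 * Returns 0 on success, -1 on failure. */
int set_irq_affinity(int irq, unsigned long mask)
{
    char path[64];
    FILE *f;

    snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    if (fprintf(f, "%lx\n", mask) < 0) {
        fclose(f);
        return -1;
    }
    /* The write is flushed here, so a rejected mask shows up as an error */
    return fclose(f) == 0 ? 0 : -1;
}
```

For example, on a dual-core device, a mask of 1 (CPU 0 only) for every interrupt leaves CPU 1 free for the real-time threads that you have pinned there.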
Measuring scheduling latencies
All the configuration and tuning you may do will be pointless if you cannot show that your device meets the deadlines. You will need your own benchmarks for the final testing, but I will describe here two important measurement tools: cyclictest and Ftrace.
cyclictest
cyclictest was originally written by Thomas Gleixner and is now available on most platforms in a package named rt-tests. If you are using the Yocto Project, you can create a target image that includes rt-tests by building the real-time image recipe like this:
$ bitbake core-image-rt
If you are using Buildroot, you need to add the package BR2_PACKAGE_RT_TESTS in the menu Target packages | Debugging, profiling and benchmark | rt-tests.
cyclictest measures scheduling latencies by comparing the actual time taken for a sleep to the requested time. If there was no latency, they would be the same, and the reported latency would be zero. cyclictest assumes a timer resolution of less than one microsecond.
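The principle can be shown with a much-reduced sketch: sleep until an absolute wake-up time, read the clock on waking, and record how late the wake-up was. This is not the cyclictest source code; the structure and function names are hypothetical, and the sketch leaves out the real-time priority and memory locking that a real measurement needs.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_nanosleep() */
#include <time.h>

#define NSEC_PER_SEC 1000000000L

struct latency_stats {
    long min_us;
    long max_us;
    long avg_us;
};

/* Advance an absolute time by ns nanoseconds. */
static void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Wake up every interval_us microseconds, loops times, and record how late
 * each wake-up was. Returns 0 on success, -1 on error. */
int measure_latency(int loops, long interval_us, struct latency_stats *st)
{
    struct timespec next, now;
    long long total = 0;

    if (loops <= 0)
        return -1;
    st->min_us = -1;
    st->max_us = 0;
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (int i = 0; i < loops; i++) {
        timespec_add_ns(&next, interval_us * 1000);
        if (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL) != 0)
            return -1;
        clock_gettime(CLOCK_MONOTONIC, &now);
        /* Latency is the gap between requested and actual wake-up time */
        long late_us = ((now.tv_sec - next.tv_sec) * NSEC_PER_SEC
                        + (now.tv_nsec - next.tv_nsec)) / 1000;
        if (st->min_us < 0 || late_us < st->min_us)
            st->min_us = late_us;
        if (late_us > st->max_us)
            st->max_us = late_us;
        total += late_us;
    }
    st->avg_us = (long)(total / loops);
    return 0;
}
```

Using an absolute wake-up time means that lateness in one loop does not shift the schedule of the next, which keeps the interval between wake-ups constant.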
cyclictest has a large number of command-line options. To start with, you might try running this command as root on the target:
# cyclictest -l 100000 -m -n -p 99
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 1.14 1.06 1.00 1/49 320
T: 0 (  320) P:99 I:1000 C: 100000 Min:      9 Act:   13 Avg:   15 Max:     134
The options selected are as follows:
-l N: This loops N times (the default is unlimited)
-m: This locks memory with mlockall
-n: This uses clock_nanosleep(2) instead of nanosleep(2)
-p N: This uses the real-time priority N
The result line shows the following, reading from left to right:
T: 0: This was thread 0, the only thread in this run. You can set the number of threads with parameter -t.
(320): This was PID 320.
P:99: The priority was 99.
I:1000: The interval between loops was 1,000 microseconds. You can set the interval with parameter -i N.
C:100000: The final loop count for this thread was 100,000.
Min: 9: The minimum latency was 9 microseconds.
Act:13: The actual latency was 13 microseconds. The actual latency is the most recent latency measurement, which only makes sense if you are watching cyclictest as it runs.
Avg:15: The average latency was 15 microseconds.
Max:134: The maximum latency was 134 microseconds.
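If you log the results of long test runs, it can be handy to pull the fields out of the result line programmatically. The following sketch uses sscanf() with a format string that mirrors the line shown previously. The structure and function names are hypothetical, and the layout of the line is assumed from this one example, so check it against the version of cyclictest that you are using.

```c
#include <stdio.h>

struct cyclictest_result {
    int thread;
    int pid;
    int priority;
    long interval_us;
    long count;
    long min_us;
    long act_us;
    long avg_us;
    long max_us;
};

/* Parse one "T: ..." result line.
 * Returns 0 on success, -1 if the line does not match the expected layout. */
int parse_result_line(const char *line, struct cyclictest_result *r)
{
    /* White space in the format matches any amount of padding in the line */
    int n = sscanf(line,
                   "T:%d (%d) P:%d I:%ld C:%ld Min:%ld Act:%ld Avg:%ld Max:%ld",
                   &r->thread, &r->pid, &r->priority, &r->interval_us,
                   &r->count, &r->min_us, &r->act_us, &r->avg_us, &r->max_us);

    return n == 9 ? 0 : -1;
}
```

The field of most interest is usually max_us, which you can compare directly with the deadline of your application.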
The result shown previously was obtained on an idle system running an unmodified linux-yocto kernel as a quick demonstration of the tool. To be of real use, you would run tests over a 24-hour period or longer while running a load representative of the maximum you expect.
Of the numbers produced by cyclictest, the maximum latency is the most interesting, but it would be nice to get an idea of the spread of the values. You can get that by adding -h <N> to obtain a histogram of samples that are up to N microseconds late. Using this technique, I obtained three traces for the same target board running kernels with no preemption, with standard preemption, and with RT preemption while being loaded with Ethernet traffic from a flood ping. The command line was as shown here:
# cyclictest -p 99 -m -n -l 100000 -q -h 500 > cyclictest.data
Then, I used gnuplot to create the three graphs that follow. If you are curious, the data files and the gnuplot command script are in the code archive, in MELP/chapter_16/plot.
The following is the output generated with no preemption:
Without preemption, most samples are within 100 microseconds of the deadline, but there are some outliers of up to 500 microseconds, which is pretty much what you would expect.
This is the output generated with standard preemption:
With preemption, the samples are spread out at the lower end, but there is nothing beyond 120 microseconds.
Here is the output generated with RT preemption:
The RT kernel is a clear winner because everything is tightly bunched around the 20-microsecond mark, and there is nothing later than 35 microseconds.
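A histogram like these can also be reduced to a single figure: the fraction of samples that were later than a given deadline. The sketch below assumes that you have already loaded the histogram into an array with one bucket per microsecond, plus a count of the samples that overflowed the histogram; the function name and the array layout are hypothetical.

```c
/* counts[i] holds the number of samples with a latency of i microseconds.
 * overflow is the number of samples that were later than the last bucket;
 * they are always counted as misses, which is the pessimistic choice.
 * Returns the fraction of samples later than deadline_us. */
double miss_rate(const unsigned long *counts, int buckets,
                 unsigned long overflow, int deadline_us)
{
    unsigned long total = overflow;
    unsigned long missed = overflow;

    for (int i = 0; i < buckets; i++) {
        total += counts[i];
        if (i > deadline_us)
            missed += counts[i];
    }
    return total ? (double)missed / (double)total : 0.0;
}
```

Comparing the result with the miss rate that your application can tolerate gives a simple pass or fail criterion for each kernel configuration.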
cyclictest, then, is a standard metric for scheduling latencies. However, it cannot help you identify and resolve specific problems with kernel latency. To do that, you need Ftrace.
Using Ftrace
The kernel function tracer has tracers to help track down kernel latencies, which is what it was originally written for, after all. These tracers capture the trace for the worst-case latency detected during a run, showing the functions that caused the delay.
The tracers of interest, together with the kernel configuration parameters, are as follows:
irqsoff: CONFIG_IRQSOFF_TRACER traces code that disables interrupts, recording the worst case
preemptoff: CONFIG_PREEMPT_TRACER is similar to irqsoff, but traces the longest time that kernel preemption is disabled (only available on preemptible kernels)
preemptirqsoff: combines the previous two traces to record the largest time either irqs and/or preemption is disabled for
wakeup: traces and records the maximum latency that it takes for the highest-priority task to get scheduled after it has been woken up
wakeup_rt: this is the same as wakeup but only for real-time threads with the SCHED_FIFO, SCHED_RR, or SCHED_DEADLINE policies
wakeup_dl: this is the same but only for deadline-scheduled threads with the SCHED_DEADLINE policy
Be aware that running Ftrace adds a lot of latency, in the order of tens of milliseconds, every time it captures a new maximum, which Ftrace itself can ignore. However, it skews the results of user space tracers such as cyclictest. In other words, ignore the results of cyclictest if you run it while capturing traces.
Selecting the tracer is the same as for the function tracer we looked at in Chapter 15, Profiling and Tracing. Here is an example of capturing a trace for the maximum period with preemption disabled for a period of 60 seconds:
# echo preemptoff > /sys/kernel/debug/tracing/current_tracer
# echo 0 > /sys/kernel/debug/tracing/tracing_max_latency
# echo 1 > /sys/kernel/debug/tracing/tracing_on
# sleep 60
# echo 0 > /sys/kernel/debug/tracing/tracing_on
The resulting trace, heavily edited, looks like this:
# cat /sys/kernel/debug/tracing/trace
# tracer: preemptoff
#
# preemptoff latency trace v1.1.5 on 3.14.19-yocto-standard
# --------------------------------------------------------------------
# latency: 1160 us, #384/384, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0)
#    -----------------
#    | task: init-1 (uid:0 nice:0 policy:0 rt_prio:0)
#    -----------------
#  => started at: ip_finish_output
#  => ended at:   __local_bh_enable_ip
#
#
#                  _------=> CPU#
#                 / _-----=> irqs-off
#                | / _----=> need-resched
#                || / _---=> hardirq/softirq
#                ||| / _--=> preempt-depth
#                |||| /     delay
#  cmd     pid   ||||| time  |   caller
#     \   /      |||||  \    |   /
   init-1       0..s.    1us+: ip_finish_output
   init-1       0d.s2   27us+: preempt_count_add <-cpdma_chan_submit
   init-1       0d.s3   30us+: preempt_count_add <-cpdma_chan_submit
   init-1       0d.s4   37us+: preempt_count_sub <-cpdma_chan_submit
[...]
   init-1       0d.s2 1152us+: preempt_count_sub <-__local_bh_enable
   init-1       0d..2 1155us+: preempt_count_sub <-__local_bh_enable_ip
   init-1       0d..1 1158us+: __local_bh_enable_ip
   init-1       0d..1 1162us!: trace_preempt_on <-__local_bh_enable_ip
   init-1       0d..1 1340us : <stack trace>
Here, you can see that the longest period with kernel preemption disabled while running the trace was 1160 microseconds. This simple fact is available by reading /sys/kernel/debug/tracing/tracing_max_latency, but the previous trace goes further and gives you the sequence of kernel function calls that lead up to that measurement. The column marked delay shows the point on the trail where each function was called, ending with the call to trace_preempt_on() at 1162 us, at which point kernel preemption is once again enabled. With this information, you can look back through the call chain and (hopefully) work out whether this is a problem or not.
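If all you want is the headline number, for example to log it at the end of an automated test run, a few lines of C are enough to read it. This is a sketch with a hypothetical function name; it takes the path as a parameter because the tracing directory may be mounted somewhere other than /sys/kernel/debug/tracing on your system.

```c
#include <stdio.h>

/* Read the worst-case latency, in microseconds, from an Ftrace
 * tracing_max_latency file. Returns -1 if the file cannot be read. */
long read_max_latency_us(const char *path)
{
    FILE *f = fopen(path, "r");
    long us = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &us) != 1)
        us = -1;
    fclose(f);
    return us;
}
```

After the capture shown previously, calling it with /sys/kernel/debug/tracing/tracing_max_latency should return the same figure as the latency line in the trace header.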
The other tracers mentioned work in the same way.
Combining cyclictest and Ftrace
If cyclictest reports unexpectedly long latencies, you can use the breaktrace option to abort the program and trigger Ftrace to obtain more information.
You invoke breaktrace using -b<N> or --breaktrace=<N>, where N is the number of microseconds of latency that will trigger the trace. You select the Ftrace tracer using -T[tracer name] or one of the following:
-C: Context switch
-E: Event
-f: Function
-w: Wakeup
-W: Wakeup-RT
For example, this will trigger the Ftrace function tracer when a latency greater than 100 microseconds is measured:
# cyclictest -a -t -n -p99 -f -b100
Further reading
The following resources have further information about the topics introduced in this chapter:
Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications by Buttazzo, Giorgio, Springer, 2011
Multicore Application Programming by Darryl Gove, Addison Wesley, 2011
Summary
The term real-time is meaningless unless you qualify it with a deadline and an acceptable miss rate. When you have these two pieces of information, you can determine whether or not Linux is a suitable candidate for the operating system and, if so, begin to tune your system to meet the requirements. Tuning Linux and your application to handle real-time events means making it more deterministic so that the real-time threads can meet their deadlines reliably. Determinism usually comes at the price of total throughput, so a real-time system is not going to be able to process as much data as a non real-time system.
It is not possible to provide mathematical proof that a complex operating system such as Linux will always meet a given deadline, so the only approach is through extensive testing using tools such as cyclictest and Ftrace and, more importantly, using your own benchmarks for your own application.
To improve determinism, you need to consider both the application and the kernel. When writing real-time applications, you should follow the guidelines given in this chapter about scheduling, locking, and memory.
The kernel has a large impact on the determinism of your system. Thankfully, there has been a lot of work on this over the years. Enabling kernel preemption is a good first step. If you still find that it is missing deadlines more often than you would like, then you might want to consider the PREEMPT_RT kernel patches. They can certainly produce low latencies, but the fact that they are not in the mainline yet means that you may have problems integrating them with the vendor kernel for your particular board. You may instead, or in addition, need to embark on the exercise of finding the cause of the latencies using Ftrace and similar tools.
That brings me to the end of this dissection of embedded Linux. Being an engineer of embedded systems requires a very wide range of skills, which include a low-level knowledge of hardware and how the kernel interacts with it. You need to be an excellent system engineer who is able to configure user applications and tune them to work in an efficient manner. All of this has to be done with hardware that is, often, only just capable of the task. There is a quotation that sums this up: An engineer can do for a dollar what anyone else can do for two. I hope that you will be able to achieve this with the information I have presented during the course of this book.