Cyberinfrastructure User Support
Andrew Sherman, Yale University
Senior Research Scientist, Yale Center for Research Computing
Senior Research Scientist, Department of Computer Science
ACI-REF Virtual Residency 2016, Thu August 11, 2016
Goals for this session
• What is CI, and how does it differ from conventional IT?
• CI user categories, and how to support them
• Some of the human aspects of CI support (i.e., politics, conflicts)
• Policies, education, outreach, collaborations, and networking

These slides are based on material from Mehmet (Memo) Belgin (GATech), modified by Henry Neeman, and are used with permission. Numerous edits have been made.
Yale Center for Research Computing
• Free-standing center reporting to Deputy Provost for Research (dotted lines to the medical school and ITS); created in July 2015
• Who we are (~15 FTEs)
  • 2 Faculty Directors (Arts & Sciences; Medical School)
  • Executive Director
  • ACI-REFs (6+): 2 research faculty; 5+ others; aligned to specific clusters
  • HPC Engineering/System Administration Team (6)
  • Director of Research Services (education, communications)
• Who we aren’t (ITS)
  • Desktop or Lab Support
  • Campus Network Operations (Science Network & DMZ is shared)
  • Data Center Operations (power, etc.)
  • Security & Authentication Services
YCRC Responsibilities

Cyberinfrastructure
• 5 HPC clusters (~17K cores??)
• HPC data storage (~8 PB)
• Research data management
  • Integration with campus-wide “Storage@Yale” active & archive tiers
  • Some integration with lab and instrumentation storage
• Science Network & DMZ

Research & Teaching Support
• Dedicated support (YCGA, G&G)
• HPC software & algorithm installations, tuning & consultation
• Support for science & engineering software applications
• National infrastructure assistance
• Grant preparation
• Faculty recruitment (startup pkgs)
• HPC support for classes

Education & Training
• Parallel Computing (credit class)
• Research Computing Workshops
  • Getting Started Bootcamps
  • Python, Parallel R, GIS
  • Group/Dept. Bootcamps
  • XSEDE & vendor workshops
• User groups

Outside Community
• CASC (http://www.casc.org)
  • Working groups on “beyond hardware” and regulated data
• XSEDE Campus Champions (2)
• ACI-REF (CaRC); ACI-REF-VR
• Northeast Big Data Hub
• LCI
What the Heck is CyberInfrastructure (CI), Anyway?
• Components
  • Computing systems
  • Data storage systems
  • Advanced instruments and data repositories
  • Visualization environments
  • High Speed Networks
  • People
• Purpose
  • Enable scholarly innovation and discoveries not otherwise possible

Based on Indiana University’s definition
Differences between CI and Conventional IT
• Primary target is performance
• Usually relies on conventional IT services (by a separate team)
• More focus on supporting end-users than services
• Uses common IT technologies in uncommon ways
• May mix shared and dedicated resources in one entity
• Requires specific middleware and software layers
• Requires code compilations using complicated mechanisms
• May require specific knowledge about the application/science
• Has irregular usage patterns, which may become obvious and troubling to users
Outline
• Part I: CI user expectations, categorization and commonalities
• Part II: Policies, Politics, Conflicts and Personality Management
• Part III: Education, Outreach, and Networking
Faculty (a/k/a Principal Investigator) Expectations
• Typical Roles
  • Research entrepreneur & teacher
  • Manager and funder of CI users
• Often knowledgeable about CI but doesn’t use it directly (that pleasure is reserved for students & postdocs!)
• May own or pay for resources and services (but shared resources may be free at some institutions)
• Expectations:
  • CI resources are reliably up and running on 7x24 basis
  • Students and collaborators have fair(?) access to CI resources required to carry out research or classroom assignments on time
  • Assistance available as and when needed
  • Regular usage and expense reports (especially for storage)
“Actual CI User” Expectations
• Typical Roles
  • Some “hands on” faculty
  • Usually students, postdocs, or others who are not permanent
  • Permanent research staff or research faculty
  • External collaborators
• Expectations
  • 7x24 access to CI resources (and short job wait times, of course)
  • “Insider” relationship to CI staff for advanced users
  • Ultra-fast learning curve
  • Simple and instant solutions to complex problems
  • Applications running much faster than on their desktops (not always possible!)
  • Help diagnosing/fixing problems that may be externally controlled
  • Answers that match their level of knowledge
CI User Categories
• Three broad categories:
  • Novice
  • Intermediate
  • Advanced
• Difficult to identify a user's category without any prior interaction
• The language used in requests is a good indicator
• Replies to follow-up questions also reveal the level of proficiency
• If uncertain, assume “novice” (but don’t make it obvious!)
Category 1: Novice Users
• Characteristics
  • Little experience with Linux or command-line environments
  • May use Matlab, Mathematica, and sometimes R (or even Excel)
  • May have limited knowledge of a scripting language like Python
  • Rarely any inkling about parallelism
• Generate up to 40-50% of support requests. Common examples:
  • Desktop setup (especially for Windows)
  • Login procedures (ssh keys, two-factor authentication, etc.)
  • Finding software on the cluster(s)
  • Finding help and documentation
• Most requests are straightforward, but some “simple-sounding” ones may take a lot of work (or be impossible)
Support Activities for Novice Users
• Up-to-date website with reasonable documentation for novices
• Getting-started presentation or on-line tutorial (possibly customized for the user’s desktop OS)
• Linux 101 workshop with software suggestions (e.g., easy editor)
• Friendly ticket system for requests, questions, and assistance
• Walk-in office hours
• Make it easy to find software, manage environment & run jobs
  • Tools like Lmod
  • Cross-cluster standardization of environment, job scheduler, etc.
  • Provide annotated template submission scripts
• Software installation assistance
• Help with tools to move data to/from clusters
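One possible way to provide the annotated template submission scripts mentioned above is to generate them from a few parameters. The sketch below is a minimal illustration, assuming a Slurm scheduler (`#SBATCH` directives) and Lmod (`module load`); the function name `render_template`, the partition name `general`, and the module name are hypothetical placeholders rather than anything taken from the slides.

```python
from textwrap import dedent


def render_template(job_name: str, cores: int, hours: int,
                    partition: str = "general",
                    module: str = "Python") -> str:
    """Return an annotated batch script for a novice user.

    Assumes Slurm (#SBATCH directives) and Lmod (module load);
    partition and module names are hypothetical site-specific values.
    """
    return dedent(f"""\
        #!/bin/bash
        # Name shown in the queue listing
        #SBATCH --job-name={job_name}
        # Queue (partition) to submit to; ask support staff which one to use
        #SBATCH --partition={partition}
        # Number of CPU cores (tasks) requested
        #SBATCH --ntasks={cores}
        # Wall-clock limit (HH:MM:SS); the job is stopped when it expires
        #SBATCH --time={hours:02d}:00:00

        # Load the software environment via Lmod
        module load {module}

        # Replace the next line with your own command
        echo "Hello from $(hostname)"
        """)


if __name__ == "__main__":
    print(render_template("my_first_job", cores=1, hours=1))
```

Keeping the comments next to each directive lets a novice edit one line at a time without consulting separate documentation.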
Category 2: Intermediate Users
• Characteristics
  • Have prior Linux cluster experience; can create job scripts, but may not understand system-wide impact of their actions
  • Varying degrees of proficiency in Python, C, Fortran, R, etc.
  • Use workflows involving multiple domain-specific packages
  • Often notice and report HW or system problems
  • May use web search to try to overcome difficulties
• Generate up to 30-40% of support requests. Common examples:
  • Assistance with complex software installations
  • Assistance with performance issues
  • Help with complex job scripts, job arrays, or parameter studies
  • Special requests (“bending the rules”), such as job priority or quota
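Requests about job arrays and parameter studies often come down to mapping an array index onto one combination of parameters. The following is a minimal sketch of that mapping, assuming the scheduler exposes the index through an environment variable (Slurm's `SLURM_ARRAY_TASK_ID` is used here as an assumption); the parameter names and values are invented for illustration.

```python
import itertools
import os

# Hypothetical parameter grid for a study; names and values are illustrative.
GRID = {
    "temperature": [280, 300, 320],
    "pressure": [1.0, 2.0],
}


def params_for_task(task_id: int, grid: dict) -> dict:
    """Map a 0-based array index to one combination from the grid."""
    names = list(grid)
    combos = list(itertools.product(*(grid[n] for n in names)))
    if not 0 <= task_id < len(combos):
        raise IndexError(f"task_id {task_id} outside 0..{len(combos) - 1}")
    return dict(zip(names, combos[task_id]))


if __name__ == "__main__":
    # Assumes a Slurm job array (for example, submitted with --array=0-5);
    # defaults to task 0 when run outside the scheduler.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    print(params_for_task(task_id, GRID))
```

Each array task runs the same script and picks its own parameters, so one submission covers the whole study instead of one hand-edited job script per combination.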
Effective Support for Intermediate Users
• “Teach them to fish”: Offer advanced, possibly domain-specific, workshops; take advantage of XSEDE or vendor offerings; Software Carpentry or Data Carpentry may be valuable for some users
• Build strong individual working relationships since these users often serve as local trainers & “experts” for their groups.
• Be transparent in discussions, since they can distinguish fact from speculation (and will probably put your advice to the test).
• Admit when you don’t know something. You aren’t expected to know everything! But then try to find out and follow up! (Network!)
• Help them find solid, high-quality on-line information (vendor sites, user forums, etc.) pitched at the proper level.
• Assist or do complex software installations, especially those involving parallel codes or significant optimizations. Help with code development/debugging/tuning may pay big dividends later.
Category 3: Advanced Users
• Characteristics
  • May be hands-on faculty, research staff, or advanced students
  • Experience with and access to multiple clusters (including XSEDE, etc.)
  • Technically proficient in scripting or programming languages
  • Develop and/or use parallel applications
  • Develop complex workflows and job scripts
  • Always trying new things; willing to experiment with new software
• Generate up to 10-15% of support requests. Common examples:
  • Installation of complex software & tools (“It’s just 1 Python module!”)
  • Requests bordering on R&D
  • Special requests/treatment (often outside of normal channels)
  • Help with special hardware (e.g., GPUs)
  • Bugs found in hardware, 3rd party applications, or libraries
Effective Support for Advanced Users
• Apply all support techniques for intermediate users here, too.
• Communicate and meet regularly with them. Happy advanced users and their faculty advisors/PIs may often be your strongest advocates at your institution.
• Treat advanced users as peers; they may know as much or more than you do about research computing.
• As appropriate, involve them in hardware acquisitions and ACI grant proposals.
• Collaborate! Resolving many of the complex problems they encounter may require close cooperation among ACI-REFs, system administrators, and others.
• Be flexible. Make small exceptions to the rules when they won’t impact others. However, watch out for slippery slopes.
Outline
• Part I: CI user expectations, categorization and commonalities
• Part II: Policies, Politics, Conflicts and Personality Management
• Part III: Education, Outreach, and Networking
Policies
• Have well-defined written policies. These set everyone’s expectations and avoid misunderstandings.
• Publish policies in places easy to find (online). Require PIs to accept your policies and make PIs responsible for the behavior of their students, postdocs, and staff.
• Be prepared to explain the reasoning behind each policy item.
• Make policies strict (conservative), but consider exceptions as needed (but avoid slippery slopes!)
• Encourage users to openly discuss and criticize the policies.
• Don’t hesitate to update policies to stay relevant.
• Build trust and effective communication with decision makers.
• Seek delegation privileges to speed things up.
• Influence, but don’t make, policies for resources you don’t own.
Scheduled Maintenance
• Set a regular schedule, with multiple advance announcements.
• Unscheduled downtimes are no excuse for skipping maintenance.
• Provide a summary of completed tasks after maintenance.
• Have clear goals; plan ahead in great detail:
  • Work with your vendors
  • Team member/task associations
  • Estimated task duration
  • Critical paths and fallback plans
• Prepare for potential problems during/after maintenance days
• Show best effort for minimal impact
  • Configure the scheduler so that no jobs are running when maintenance begins
  • Disable user access to resources during the maintenance activities
  • Assist users in moving work to alternative clusters when possible
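One common way to reach maintenance day with no running jobs is to have the scheduler hold any job whose requested time limit would overlap the maintenance window. The sketch below illustrates only that decision rule; it is not any particular scheduler's implementation, and the function name and dates are hypothetical.

```python
from datetime import datetime, timedelta


def can_start(now: datetime, walltime: timedelta,
              maintenance_start: datetime) -> bool:
    """True if a job started now would finish before maintenance begins."""
    return now + walltime <= maintenance_start


if __name__ == "__main__":
    maintenance = datetime(2016, 8, 15, 8, 0)   # hypothetical window start
    now = datetime(2016, 8, 14, 20, 0)          # 12 hours before the window
    for hours in (6, 12, 24):
        ok = can_start(now, timedelta(hours=hours), maintenance)
        verdict = "start" if ok else "hold until after maintenance"
        print(f"{hours:>2}h job: {verdict}")
```

Schedulers such as Slurm offer reservations that produce this behavior, which is one reason multiple advance announcements matter: users can shorten requested time limits so their jobs still fit before the window.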
Politics and Conflicts
• Tricky but inevitable
• No magic formula, need case-specific creative solutions
• Biggest challenge: conflicts due to limited resources
  • Configure systems to match your policies.
  • Collect and store data for past and present usage.
  • Provide users with tools to browse data/statistics for their accounts.
  • Run regular audits to defuse problems before they explode.
  • Consider a scavenge queue for pre-emptible jobs
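The usage data mentioned above becomes useful in disputes once it is summarized per group and per user. A minimal sketch, assuming each job record carries a group, a user, a core count, and elapsed hours; the record layout and all names are hypothetical, not a real accounting format.

```python
from collections import defaultdict
from typing import NamedTuple


class JobRecord(NamedTuple):
    # Hypothetical record layout; real accounting logs differ by scheduler.
    group: str
    user: str
    cores: int
    elapsed_hours: float


def core_hours_by_group(jobs):
    """Sum core-hours per group, and per user within each group."""
    totals = defaultdict(lambda: defaultdict(float))
    for job in jobs:
        totals[job.group][job.user] += job.cores * job.elapsed_hours
    return {group: dict(users) for group, users in totals.items()}


if __name__ == "__main__":
    jobs = [
        JobRecord("chem", "alice", 16, 2.0),
        JobRecord("chem", "bob", 4, 10.0),
        JobRecord("astro", "carol", 64, 0.5),
        JobRecord("chem", "alice", 8, 1.0),
    ]
    for group, users in core_hours_by_group(jobs).items():
        print(group, sum(users.values()), users)
```

A per-group, per-user table like this gives PIs something concrete to review during audits, rather than relying on impressions of who is using the cluster most.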
Tiers of Conflict
• Internal to a group/department: Usually easier to solve with communication and informal agreements. Sometimes a good job scheduler can help (e.g., multi-level fairshare). Provide advice, but get the PI or chair to take the lead and own the resolution.
• Between groups/departments: Can get messy, but may be avoidable if you stick to your policies. Be even-handed; don’t show favoritism. Get all agreements in writing!
• Between users and CI support staff: Have clear policies handy as a basis for declining unreasonable or impossible requests, and keep solid statistics/data as evidence. As above, be even-handed; don’t show favoritism. Get all agreements in writing!
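The multi-level fairshare idea mentioned above for conflicts within a group can be illustrated with a simplified calculation. This is a sketch only, assuming a two-level hierarchy (group, then user) and an exponential factor of the form F = 2^(-usage/share), similar in spirit to what some schedulers document; it is not the exact algorithm of any specific scheduler, and all names and numbers are hypothetical.

```python
def fairshare_factor(usage_fraction: float, share_fraction: float) -> float:
    """Return a priority factor in (0, 1]; 0.5 means usage equals share.

    Simplified form F = 2 ** (-usage / share); an illustrative assumption.
    """
    if share_fraction <= 0:
        raise ValueError("share_fraction must be positive")
    return 2.0 ** (-usage_fraction / share_fraction)


def user_factor(group_share: float, user_share_in_group: float,
                group_usage: float, user_usage_in_group: float) -> float:
    """Two-level fairshare: a user's effective share and usage are the
    products of the group-level and within-group fractions."""
    effective_share = group_share * user_share_in_group
    effective_usage = group_usage * user_usage_in_group
    return fairshare_factor(effective_usage, effective_share)


if __name__ == "__main__":
    # The group holds 50% of the cluster and has used 50% of recent cycles.
    # Within the group, two users have equal shares but unequal usage.
    heavy = user_factor(0.5, 0.5, 0.5, 0.9)
    light = user_factor(0.5, 0.5, 0.5, 0.1)
    print(f"heavy user factor: {heavy:.3f}")
    print(f"light user factor: {light:.3f}")
```

Because the lighter user ends up with the higher factor, the scheduler rebalances access inside the group automatically, which leaves the PI to set the shares rather than referee individual jobs.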
Personality Management
• Some users are more difficult than others. That’s life!
• Don’t take things personally; report harassment; never retaliate
• Users don’t mean to be difficult, but they may be under great pressure and extremely frustrated
• If you make a mistake, take responsibility and offer an apology.
• Show empathy and sincerity
• Acknowledge that:
  • you understand the user’s concerns;
  • you are aware of the issue’s particular impact on the user.
• Be sensitive to cultural differences and language difficulties.
• Use humor appropriately, and avoid being awkward or insulting.
• Communicate frequently while working on any issue
Outline
• Part I: CI user expectations, categorization and commonalities
• Part II: Policies, Politics, Conflicts and Personality Management
• Part III: Education, Outreach, and Networking
Trainings and Tutorials
• Research Computing Workshops
  • Getting Started Bootcamps
  • Python, Parallel R, GIS
  • Group/Dept. Bootcamps
  • XSEDE & vendor workshops
  • Software Carpentry; Data Carpentry; SC Tutorials & Workshops
• Special Topics
  • Parallel Computing
  • Debugging/optimization of codes (including parallel)
  • System architecture specific details
  • Advanced use of common tools (Scientific Python, Parallel MATLAB)
Group Consultations
• Mini-orientations for new groups (“On-Boarding”)
• Use group meetings for feedback & to resolve internal conflicts
• Resolution of technical problems that are specific to a group
• Technical feedback to assist in policy making and system purchases
• Introduce services to new groups interested in getting resources
Collaborations with Researchers and Vendors
• Researchers helping researchers
  • Crucial for staying relevant: What is your faculty planning?
  • Collaborative grant writing
  • Collaborative projects/papers (acknowledgements or co-authors)
  • Support for classes and workshops
• Developer/vendor collaborations
  • Bug tracking and fixes
  • HW/SW information, evaluation of new systems and technology
  • Pilot studies & benchmarks
Some External Groups for Staff Training & Networking
• ACI-REF; ACI-REF-VR; CaRC
• XSEDE Campus Champions (national & regional)
• CASC (http://www.casc.org)
  • Working groups on “beyond hardware” and regulated data
• Educause
• LCI (aimed at HPC system administration)