Post on 20-Feb-2017
transcript
Copyright © 2015 Splunk Inc.
Hands-On Building Business Service Intelligence with IT Service Intelligence (ITSI)
Tom Harrop - Sales Support Specialist
David Millis - Staff Architect
Setup Before You Can Play
1. Download this presentation slide deck: https://splunk.box.com/v/ITSI-HandsOn-PaloAlto
2. If you have not done so already, sign up for the FREE Splunk ITSI Online Sandbox:
• http://splunk.com/itsi
• Select "Free Online Sandbox"
3. Please test access to your sandbox:
• Chrome, Firefox, Safari are recommended
• IE is NOT recommended
4. After logging in, select IT Service Intelligence from the list of apps at the left
What is a Service?
Requests -> Service -> Responses
In ITSI, a Service is a logical group of technology components that a user deems need to be monitored together.
It can often be generalized as a "black box" to which we send requests, and from which we expect responses.
What is a Service?
Technical Services
DNS: Requests / Responses
Auth: Requests / Responses
Web: Requests / Responses
Services can be lower level (technical)…
What is a Service?
Technical Services
DNS: Requests / Responses
Auth: Requests / Responses
Web: Requests / Responses
Business Services
Customer Transactions: Requests / Responses
Support Desk: Requests / Responses
Services can also be higher level (business)…
What is a Service?
Packet Network
Hypervisor and Hosts
RBMDBs
Storage Tier
API Services
Web Services
Customer Transactions
Mobile API / Middleware
Partner Portal
DNS
Services can encompass multiple tiers of the IT domain. Services may also depend upon other services.
What is a KPI?
DNS: Requests / Responses
KPI: Number of requests
KPI: Error rate
KPI: Average response time
KPI: Server CPU load
KPI: Server network I/F errors
Customer Transactions: Requests / Responses
KPI: Number of transactions
KPI: Error rate
KPI: Average response time
KPI: Count of Incident Tickets
KPI: Synthetic Transx Health
KPIs and Health scores constitute the means by which Services are monitored.
Key Performance Indicators (KPIs)
A Key Performance Indicator (KPI) is a Splunk saved search created within the ITSI UI that helps monitor a specific field like CPU, Memory, Number of Errors and so on. KPIs are contained within Services.
Service Health Scores
A Health score is a score from 0-100 (0 being critical and 100 being normal) that helps determine the health of a Service. It is calculated from the importance and status (e.g. green, orange, red) of all the Service's KPIs, once every minute.
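The slide says only that a health score depends on each KPI's importance and status. As a rough illustration, and not ITSI's actual algorithm, the sketch below takes an importance-weighted average of per-status scores. The status-to-score mapping, the `Kpi` class and the `health_score` function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical mapping from a KPI status colour to a 0-100 score (assumption).
STATUS_SCORE = {"green": 100, "orange": 50, "red": 0}


@dataclass
class Kpi:
    name: str
    importance: int  # relative weight chosen by the service owner (assumed scale)
    status: str      # "green", "orange" or "red"


def health_score(kpis):
    """Importance-weighted average of status scores, on a 0-100 scale."""
    total_weight = sum(k.importance for k in kpis)
    if total_weight == 0:
        raise ValueError("at least one KPI needs a non-zero importance")
    weighted = sum(k.importance * STATUS_SCORE[k.status] for k in kpis)
    return weighted / total_weight


kpis = [
    Kpi("Error rate", importance=8, status="red"),
    Kpi("Average response time", importance=5, status="orange"),
    Kpi("Server CPU load", importance=3, status="green"),
]
print(round(health_score(kpis), 1))
```

With this weighting, a red high-importance KPI pulls the score down much further than a red low-importance one, which is the behaviour the slide describes.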
Service Decomposition in ITSI
1 - What is a high-value business service? (Online Store)
2 - Process flow, and underlying sub-services? (Web -> Middleware -> DB -> Middleware -> Web)
3 - For each (sub)service: KPIs to show health & status? (Database: errors, SQL hits, response time, …)
4 - For each KPI: Need a Splunk search (index=DB (warn* OR error*) | stats count)
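The four decomposition steps can be captured as plain data before anything is built in ITSI. The sketch below is one hypothetical way to write them down in Python; the dictionary layout and the `sub_services_without_kpis` helper are illustrative and are not ITSI objects. Only the search string comes from the slide.

```python
# Hypothetical model of the four decomposition steps (not an ITSI data structure).
online_store = {
    "service": "Online Store",                                 # step 1
    "flow": ["Web", "Middleware", "DB", "Middleware", "Web"],  # step 2
    "sub_services": {
        "DB": {
            "kpis": {
                # step 3 -> step 4: KPI name mapped to its Splunk search
                "errors": "index=DB (warn* OR error*) | stats count",
            }
        },
        "Web": {"kpis": {}},
        "Middleware": {"kpis": {}},
    },
}


def sub_services_without_kpis(model):
    """Return the sub-services that still have no KPI defined."""
    return sorted(
        name for name, sub in model["sub_services"].items() if not sub["kpis"]
    )


print(sub_services_without_kpis(online_store))
```

A list like this makes it easy to see which sub-services still need KPIs and searches before the service is complete.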
New Requirements!
● Create a new KPI for the DB Service:
● Network Utilization
● Modify the Executive Glass Table in order to show off the services you slave over
"WE only have about 15 min TO DO WHAT???!!???" Think about how long this would take you today?
Let's Talk Entities
● Select DB Service
● Entities are the relevant things which support this service (usually hosts)
● Select the right entries with filters, ANDs, ORs
● Original Entity list can come from CMDB, spreadsheet, Splunk search, others
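To make the "filters, ANDs, ORs" idea concrete, here is a minimal sketch of entity selection in Python. It is not how ITSI stores or evaluates entity rules. The entity records, field names and the `matches` function are hypothetical.

```python
# Hypothetical entity records, as they might come from a CMDB export or spreadsheet.
entities = [
    {"host": "db-01", "role": "database", "site": "east"},
    {"host": "db-02", "role": "database", "site": "west"},
    {"host": "web-01", "role": "web", "site": "east"},
]


def matches(entity, any_of):
    """any_of is a list of rules (OR); each rule is a dict of field=value pairs (AND)."""
    return any(
        all(entity.get(field) == value for field, value in rule.items())
        for rule in any_of
    )


# (role=database AND site=east) OR (host=db-02)
db_service_rules = [{"role": "database", "site": "east"}, {"host": "db-02"}]
db_entities = [e["host"] for e in entities if matches(e, db_service_rules)]
print(db_entities)
```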
A KPI in 5 minutes? Absolutely!
Click New – Generic KPI
Select Data Model
● Host Operating System
● Network
● # bytes
● Next
KPIs Continued….
Splunk Builds Searches for you – Oh Yeah, that's happening :)
● Select Yes for Split by & Filter options
● Select host for Entity Lookup & Alias options
● Click Next
Almost There…
Select
● KPI Search Schedule: Every Minute
● Entity Calculation: Average
● Service/Agg Calculation: Average
● Calculation Window: Last Minute
● Click Next
● Unit: Bps
● Click Next
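The settings above describe two calculations over a one-minute window: an average per entity and an average for the service as a whole. The sketch below shows one plausible reading, an average of the per-host averages. Whether ITSI aggregates exactly this way is an assumption, and the sample data and `kpi_values` function are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples from the last minute: (host, bytes per second).
samples = [
    ("db-01", 1200.0),
    ("db-01", 1400.0),
    ("db-02", 800.0),
    ("db-02", 1000.0),
]


def kpi_values(samples):
    """Average per entity (split by host), then average those for the service."""
    by_host = defaultdict(list)
    for host, bps in samples:
        by_host[host].append(bps)
    per_entity = {host: mean(values) for host, values in by_host.items()}
    aggregate = mean(list(per_entity.values()))
    return per_entity, aggregate


per_entity, aggregate = kpi_values(samples)
print(per_entity, aggregate)
```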
Final Steps…
Set your thresholds:
● Aggregate (All)
● Per Entity
● Click "Add Threshold" TWICE
● Make the Neapolitan ice cream colors Yellow, Green, Yellow
● Drag the sliders around in order to get the current data graph entirely inside the Green (normal) band
● Click Finish
● Other options are also available, including adaptive thresholds and anomaly detection
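The Yellow, Green, Yellow layout amounts to two static thresholds with a normal band between them. A minimal sketch of that banding follows. The function names, threshold values and sample data are hypothetical, and ITSI's own threshold evaluation may differ in detail, for example at the boundaries.

```python
def classify(value, low, high):
    """Yellow below `low`, green from `low` up to `high`, yellow above `high`."""
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    return "green" if low <= value <= high else "yellow"


def thresholds_cover(values, low, high):
    """True when every recent data point sits inside the green band."""
    return all(classify(v, low, high) == "green" for v in values)


recent_bps = [950.0, 1100.0, 1275.0]  # hypothetical recent KPI values
print(thresholds_cover(recent_bps, low=800.0, high=1500.0))
```

Dragging the sliders until the graph sits inside the green band corresponds to choosing `low` and `high` so that `thresholds_cover` returns true for current data.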
Anomaly Detection
● Machine Learning
● Works well for data with patterns
● Requires some "training" (trial & error) to zero in on best sensitivity
● More sophisticated capabilities coming! (multivariate, more algorithms, etc)
Name that KPI!
From the list of KPIs, select your new one (at the bottom)
● Click on the little pencil next to the name
● Call it "Network Utilization", with your username up front
● Click on Save at bottom right when finished!
Clone the Glass Table
Return to Saved Glass Tables page (click on Glass Tables in the upper menu bar)
CLICK Edit for "Buttercup Games Business Process (IN PROGRESS)"
• Select Clone
• Title: Add your username to the front
• Permissions: Shared in App
• Click Clone Page
• Click on your new Glass Table from the list, to view it
Edit & Have Fun!
Click on Edit in the upper right corner of your Glass Table
Use the "Services" panel on the left to select Individual KPIs, or Aggregate Service Health Scores
• Choose 2 KPIs from Online Store that would be useful in the "Order Process" section
• Drag the selected widgets onto the canvas, positioning in the gray oval
• What's the difference between the [icon] and [icon] tools at the top left?
More Fun with the Glass Table Editor…
Use the Configurations panel on the right to edit a selected widget
• Can change the visualization type, drilldown behavior, and other settings
• You should hit Save frequently
• I wonder what Auto Layout does?
• (YIKES!) Revert All Changes might be helpful
Finishing up…
• Add a Service Health Score widget for Online Store under Buttercup
• Choose a Viz Type with a sparkline graph, then resize to make it look pretty
• Modify the Custom Drilldown action to go to the saved glass table, Buttercup Games Online Store
• Bonus Points: Make the label bigger, more readable
• Click Save
• View when done
A Troubleshooting Exercise
Let's use ITSI to troubleshoot an outage
● Start at your Glass Table, "<UserName> Buttercup Business Process"
● Customer Care reports that unhappy customers are complaining of failures and long delays when trying to purchase
● The calls began coming in at around the top of the last hour.
● In the upper right corner of the Glass Table, change the time picker from Now to XX:00:00.0, where XX is the previous hour. For example, if it is currently 14:05, set the time picker to 13:00:00.0, then Apply
● This is how we can "time travel" back to see conditions at a particular outage – oh yeah!
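The time-picker value is just the top of the hour before the current one. As a small worked example of that arithmetic, the hypothetical helper below computes it in Python. It is only an illustration and is not part of ITSI.

```python
from datetime import datetime, timedelta


def previous_hour_start(now):
    """Return the top of the hour before `now` (14:05 -> 13:00:00)."""
    top_of_this_hour = now.replace(minute=0, second=0, microsecond=0)
    return top_of_this_hour - timedelta(hours=1)


# The date is arbitrary; only the 14:05 time comes from the slide's example.
picker = previous_hour_start(datetime(2016, 1, 1, 14, 5))
print(picker.strftime("%H:%M:%S"))
```

Subtracting a `timedelta` rather than decrementing the hour field keeps the result correct just after midnight, when the previous hour falls on the previous day.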
A Troubleshooting Exercise, cont'd
● The Online Store seems to be degraded, just as Customer Care reported. Click on the widget under Buttercup to drill down further
A Troubleshooting Exercise, cont'd.
● The Online Store Glass Table shows a much more detailed view, including the impacted customer-facing KPIs at the far left (Revenue, etc)
● Based on this view of all the relevant services, where do you think the root cause lies?
● Which service should we troubleshoot first?
● Click on Health widget for that service, to drill down to a Deep Dive
Deep Dive
● Deep Dive shows multiple KPIs and Health Scores in parallel "swimlanes".
● The Health Score for this Service is the top swimlane. Can you see when it begins to degrade from 100%?
● Mousing over this point in time, can you spot the KPI with the leading fault indication, i.e., what failed first?
● To improve readability, make sure the Primary Time Range (lower left corner) is set to Presets > Last 60 minutes
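Spotting "what failed first" by eye is the same as finding the earliest non-normal sample across the swimlanes. The sketch below does that over made-up data. The lane names, statuses and the `first_to_degrade` function are hypothetical and only illustrate the comparison the Deep Dive view lets you make visually.

```python
# Hypothetical swimlanes: KPI name -> list of (minute offset, status) samples.
lanes = {
    "DB errors": [(0, "green"), (5, "red"), (10, "red")],
    "Web response time": [(0, "green"), (5, "green"), (10, "orange")],
    "Network Utilization": [(0, "green"), (5, "green"), (10, "green")],
}


def first_to_degrade(lanes):
    """Return (kpi, minute) for the earliest non-green sample, or None."""
    firsts = []
    for kpi, samples in lanes.items():
        bad = [minute for minute, status in samples if status != "green"]
        if bad:
            firsts.append((min(bad), kpi))
    if not firsts:
        return None
    minute, kpi = min(firsts)
    return kpi, minute


print(first_to_degrade(lanes))
```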
Multi-KPI Alerts and Notable Events
● Click on Notable Events Review
● Multiple KPIs and Health scores can be combined in sophisticated ways to create Multi-KPI alerts
● When a Multi-KPI alert fires, one of the outcomes is the creation of a Notable Event
● Notable Events allow NOC personnel and others to triage and coordinate event management efforts
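As an illustration of combining several KPI conditions into one alert, the sketch below fires only when every KPI in a rule is in one of its listed statuses, and returns a dictionary standing in for a Notable Event. The rule format, field names and the `multi_kpi_alert` function are hypothetical. ITSI's real Multi-KPI alert definitions are richer than this.

```python
def multi_kpi_alert(statuses, rule):
    """Fire when every KPI named in `rule` is at one of its listed statuses.

    Returns a notable-event dict when the rule matches, otherwise None.
    """
    triggered = all(statuses.get(kpi) in bad for kpi, bad in rule.items())
    if not triggered:
        return None
    return {
        "title": "Multi-KPI alert",
        "kpis": sorted(rule),
        "status": "new",  # triage state for NOC staff (assumed field)
    }


# Hypothetical rule: DB errors are red AND web response time is orange or red.
rule = {"DB errors": {"red"}, "Web response time": {"orange", "red"}}
current = {"DB errors": "red", "Web response time": "orange"}
print(multi_kpi_alert(current, rule))
```

Requiring several conditions at once is what lets a single event replace a stream of separate per-KPI alerts.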
Service Analyzer
● Click on Service Analyzer > Default Service Analyzer
● Back where we started!
● This view shows a "no-frills" list of services (top) and hottest KPIs (bottom)
● Provides a quick jumping off point into Deep Dives and the Notable Events Review
● It is useful for NOCs and others who need a high-level situational view
Review
● High-value services can be decomposed and modeled in ITSI, using machine data from the relevant systems
● Services and KPIs can be created in minutes, with sophisticated thresholding techniques to distinguish "normal" from "not normal"
● Glass Tables allow service health and KPI metrics to be displayed in a way that makes sense to specific groups, such as Executive Leadership, Business Service Owners, the NOC, DevOps & Others
● Deep Dives allow KPIs to be compared side-by-side across any time range, accelerating root cause analysis and significantly reducing MTTR
● Multi-KPI Alerts and Notable Events reduce alert noise, producing actionable events and a means to manage them
● …and it's fun to build!
Thanks for Coming!
● Please take our survey: https://www.surveymonkey.com/r/8QHPSHX
● Let us know if you have questions, or are interested in pursuing a Glass Table Exercise for your organization
● ITSI Guidebook: From your ITSI instance:
● Search -> Dashboards -> ITSI Sandbox Guide
David Millis dmillis@splunk.com
Tom Harrop tharrop@splunk.com
SEPT 26-29, 2016
WALT DISNEY WORLD, ORLANDO
SWAN AND DOLPHIN RESORTS
• 5000+ IT & Business Professionals
• 3 days of technical content
• 165+ sessions
• 80+ Customer Speakers
• 35+ Apps in Splunk Apps Showcase
• 75+ Technology Partners
• 1:1 networking: Ask The Experts and Security Experts, Birds of a Feather and Chalk Talks
• NEW hands-on labs!
• Expanded show floor, Dashboards Control Room & Clinic, and MORE!
The 7th Annual Splunk Worldwide Users' Conference
PLUS Splunk University
• Three days: Sept 24-26, 2016
• Get Splunk Certified for FREE!
• Get CPE credits for CISSP, CAP, SSCP
• Save thousands on Splunk education!