Machine Learning + Analytics

Post on 15-Feb-2017


Copyright © 2016 Splunk Inc.

Operationalizing Machine Learning

Brian Nash

Splunk Architect; Business Analytics, ML & IoT


Disclaimer

During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make.

In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.


Why Machine Learning?


Humans are good at learning, but we get lost in volume and details…


Why do we need Machine Learning?

- Improve Decision Making
- Forecast or Predict KPIs
- Alert on Deviation
- Uncover hidden trends or relationships

All of this requires Diverse Data from across Many Silos. Lots of Unstructured, Real Time Data.


Run the Business in Real-time

Data From the Past (T – a few days)

Real-time Data

Statistical Forecast (T + a few days)

Security Operations Center

IT Operations Center

Business Operations Center

Predictive (Models)

Descriptive (BI Tools, Data Lakes)

Grey space


What is Machine Learning?


ML 101: What is Machine Learning?

What: “Field of study that gives computers the ability to learn without being explicitly programmed” – A. Samuel, 1959

How: Generalizing (learning) from examples (data)

Simple ML workflow:
– EXPLORE data
– FIT models based on data
– APPLY models in production
– VALIDATE models
– REPEAT
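The workflow above can be sketched in a few lines of plain Python. This is only an illustration, not Splunk's implementation: the function names (`fit`, `apply_model`, `validate`) and the sample numbers are invented for the example, and the "model" is a simple least-squares line.

```python
from statistics import mean

def fit(xs, ys):
    """FIT: ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def apply_model(model, xs):
    """APPLY: use the fitted line to predict y for new x values."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]

def validate(model, xs, ys):
    """VALIDATE: mean absolute error on data the model has not seen."""
    predictions = apply_model(model, xs)
    return mean(abs(p - y) for p, y in zip(predictions, ys))

# EXPLORE: look at the (made-up) historical data before modeling it.
history_x = [1, 2, 3, 4, 5]
history_y = [12.0, 14.0, 16.0, 18.0, 20.0]
print("y ranges from", min(history_y), "to", max(history_y))

model = fit(history_x, history_y)                 # FIT
forecast = apply_model(model, [6, 7])             # APPLY
error = validate(model, [6, 7], [22.0, 24.0])     # VALIDATE
print(model, forecast, error)
# REPEAT: if the error is too large, return to EXPLORE with more or better data.
```

In practice the REPEAT step matters most: a model that validates well today is refit as new data arrives.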


Learning From Data

[Prediction]
• When we see thick clouds and an overcast sky, we predict that it’s likely going to rain

[Estimation / Regression]
• Estimate how much an apartment costs based on its location, condition and prices of properties in that neighborhood

[Classification / Clustering]
• Determine the gender of a person based on her/his features, hair style and the way s/he dresses

[Anomaly Detection]
• Identify the odd one out

[Reinforcement Learning]
• If I made a mistake this time, can I do better next time?

All of us have had some experience in learning. But… what’s behind our experience? How do we translate that knowledge to code?
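As one small answer to that question, here is a sketch of "identify the odd one out" in Python. It assumes a robust rule of thumb (the modified z-score, built on the median and the median absolute deviation, with the conventional 3.5 cutoff); the function name and the readings are made up for illustration.

```python
from statistics import median

def odd_ones_out(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case: more than half the values are identical.
        return [v for v in values if v != med]
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2]  # hypothetical sensor values
print(odd_ones_out(readings))
```

The median is used instead of the mean so that the odd value itself does not distort the baseline it is compared against.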


Major Types of Machine Learning

1. Supervised Learning: generalizing from labeled data


Major Types of Machine Learning

2. Unsupervised Learning: generalizing from unlabeled data


Major Types of Machine Learning

3. Reinforcement Learning:
• System is rewarded (or punished) based on the outcomes it generates
• Action leads to a change in the state of the world and generates an error score


Machine Learning With Splunk


Overview of ML at Splunk

Core Platform Search

Packaged Premium Solutions

Custom ML

Platform for Operational Intelligence


Splunk IT Service Intelligence

Get Data

Define services, entities and KPIs

Monitor and troubleshoot

Analyze and detect

Data-Defined, Data-Driven Service Insights

Packaged ML: Adaptive Thresholds and Anomaly Detection

One of several Premium Solutions
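To make the idea of an adaptive threshold concrete, here is a minimal Python sketch. It is not how ITSI computes its thresholds; it simply assumes that a KPI's normal range depends on the hour of day and sets each hour's upper limit at the historical mean plus `k` standard deviations. The function name, the parameter `k`, and the data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

def adaptive_thresholds(history, k=2.0):
    """Map each hour of day to an upper threshold learned from history.

    history: iterable of (hour_of_day, kpi_value) pairs.
    """
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)
    return {hour: mean(vals) + k * pstdev(vals) for hour, vals in by_hour.items()}

# Made-up KPI history: quiet at 03:00, busy at 14:00.
history = [(3, 10.0), (3, 12.0), (3, 11.0),
           (14, 100.0), (14, 110.0), (14, 90.0)]
thresholds = adaptive_thresholds(history)

# The same reading is alarming at night but ordinary in the afternoon.
reading = 50.0
print(reading > thresholds[3], reading > thresholds[14])
```

A single static threshold would either miss the 03:00 spike or alert constantly at 14:00, which is the problem adaptive thresholds are meant to avoid.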


Splunk Machine Learning Toolkit

Assistants: Guide model building, testing, & deploying for common objectives

Showcases: Interactive examples for typical IT, security, business, IoT use cases

Algorithms: 25+ standard algorithms available prepackaged with the toolkit

SPL ML Commands: New commands to fit, test and operationalize models

Python for Scientific Computing Library: 300+ open source algorithms available for use

Build custom analytics for any use case

Extends Splunk platform functions and provides a guided modeling environment


Algorithms supported (v2.0)

ITSI, UBA

Domain Expertise (IT, Security, …)

Data Science Expertise

Splunk Expertise

Custom Machine Learning – Success Formula

Identify use cases

Drive decisions

Set business/ops priorities

SPL

Data prep

Statistics/math background

Algorithm selection

Model building

Splunk ML Toolkit facilitates and simplifies via examples & guidance

Operational success


Summary: The ML Process

Problem: <Stuff in the world> causes big time & money expense.

Value Hypothesis

Solution: Build ML model to forecast <possible incidents>, act pre-emptively & learn

Operationalize

1. Get all the relevant data to the problem; Explore the data

2. Select and Fit an algorithm on the data, generating a model

3. Apply & Validate models until predictions solve the problem

4. Surface the model to XOps, who consume the model to solve the problem


Machine Learning Process with Splunk

Collect Data

Explore / Visualize

Model

Evaluate

Clean / Transform

Publish / Deploy

props.conf, transforms.conf, Datamodels, Add-ons from Splunkbase, etc.

Pivot, Table UI, SPL, ML Toolkit

Alerts, Dashboards, Reports


Splunk Architecture & ML


Continuous Data Ingest at Scale

Develop, Visualize, Predict, Alert, Search

Engineers, Data Analysts

Security Analysts

Business Users

Native Inputs: TCP, UDP, Logs, Scripts, Wire, Mobile

Industrial Data: SCADA, AMI, Meter Reads

Modular Inputs: MQTT, AMQP, CoAP, REST, JMS

HTTP Event Collector: Token Authenticated Events

Real Time

Technology Partnerships: Kepware, AWS IoT, Cisco, Palo Alto

Maintenance Info

Asset Info

Data Stores

External Lookups / Enrichment

OT

Industrial Assets

IT

Consumer and Mobile Devices


Sense and Respond

Real Time, Search, Alert

Third-Party Applications

Smartphones and Devices

Tickets

Email

Send an email

File a ticket

Send a text

Flash lights

Trigger process flow

OT

Industrial Assets

IT

Consumer and Mobile Devices

Every Search Can Use Machine Learning


Splunk: Data Fabric

OT

Industrial Assets

IT

Consumer and Mobile Devices

Real Time

IT users, Analysts, Business Users

Ad Hoc Search

Custom Dashboards

Monitor and Alert

Reports / Analyze

Clickstreams, Hadoop, Devices, Networks

GPS / Cellular

Online Shopping Carts

Servers, Applications

Analysts, Business Users

Data Warehouses

Structured Data Sources

CRM, ERP, HR, Billing, Product, Finance

DB Connect, Look-ups

ODBC, SDK, API

Different lenses into the same data

SCADA Ops Center, Biz Ops Center

IT Ops Center

Compliance

Security Ops Center

Data Reuse = Greater Data Leverage

Fraud Ops Center, etc…


Machine Learning Use Cases


ML Is All Around You!

Recall: EXPLORE > FIT > APPLY > VALIDATE > REPEAT

• Face detection: find faces in images

• Spam filtering: identify spam messages

• Shopping recommendations: predict what customers would like to buy

• Fraud detection: identify credit card transactions which may be fraudulent in nature

• Weather forecast: predict whether or not it will rain tomorrow; estimate daily max/min


Machine Learning Customer Success

Network Incident Detection

Service Degradation Detection

Security / Fraud Prevention

Prioritize Website Issues and Predict Root Cause

Predict Gaming Outages

Fraud Prevention

Machine Learning Consulting Services

Analytics App built on ML Toolkit

Optimizing operations and business results

Cell Tower Incident Detection

Optimize Repair Operations

Entertainment Company


ML Toolkit Customer Use Cases

Speeding website problem resolution by automatically ranking actions for support engineers

Reducing customer service disruption with early identification of difficult-to-detect network incidents

Minimizing cell tower degradation and downtime with improved issue detection sensitivity

Improving cell tower uptime and reducing repair truck rolls with anomaly detection and root cause analysis

Predicting and averting potential gaming outage conditions with finer-grained detection

Ensuring mobile device security by detecting anomalies in ID authentication

Preventing fraud by identifying malicious accounts and suspicious activities

Entertainment Company


Detect Network Outliers

Reduced downtime + increased service availability = better customer satisfaction

ML Use Case
• Monitor noise rise for 20,000+ cell towers to increase service and device availability, reduce MTTR

Technical overview
• A customized solution deployed in production based on outlier detection.
• Leverage previous month data and voting algorithms

“The ability to model complex systems and alert on deviations is where IT and security operations are headed… Splunk Machine Learning has given us a head start...”
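The slide does not say which algorithms vote or how the votes are combined, so the following Python sketch is only one plausible reading of "previous month data and voting algorithms". It assumes three common outlier tests (z-score, IQR fence, and modified z-score) that each judge a new reading against a baseline of last month's values, and flags the reading when at least two of them agree. All names, limits, and data are hypothetical.

```python
from statistics import mean, median, stdev, quantiles

def zscore_vote(baseline, value, limit=3.0):
    """Vote 'outlier' if the value is more than `limit` std devs from the mean."""
    sd = stdev(baseline)
    return sd > 0 and abs(value - mean(baseline)) / sd > limit

def iqr_vote(baseline, value, k=1.5):
    """Vote 'outlier' if the value falls outside the Tukey fences."""
    q1, _, q3 = quantiles(baseline, n=4)
    spread = q3 - q1
    return value < q1 - k * spread or value > q3 + k * spread

def mad_vote(baseline, value, limit=3.5):
    """Vote 'outlier' using the median-based modified z-score."""
    med = median(baseline)
    mad = median(abs(v - med) for v in baseline)
    return mad > 0 and 0.6745 * abs(value - med) / mad > limit

def is_outlier(baseline, value, min_votes=2):
    """Flag the value only when enough detectors agree."""
    votes = [zscore_vote(baseline, value),
             iqr_vote(baseline, value),
             mad_vote(baseline, value)]
    return sum(votes) >= min_votes

# Made-up "previous month" of daily noise readings for one tower (dBm).
baseline = [-105.0 + (day % 5) * 0.5 for day in range(30)]

print(is_outlier(baseline, -95.0))    # far above the usual noise floor
print(is_outlier(baseline, -101.9))   # borderline: only one detector objects
```

Requiring agreement between detectors trades a little sensitivity for fewer false alarms, which matters when the same check runs across tens of thousands of towers.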


Reliable website updates

Proactive website monitoring leads to reduced downtime

“Splunk ML helps us rapidly improve end-user experience by ranking issue severity which helps us determine root causes faster thus reducing MTTR and improving SLA”

ML Use Case
• Very frequent code and config updates (1000+ daily) can cause site issues
• Find errors in server pools, then prioritize actions and predict root cause

Technical overview
• Custom outlier detection built using ML Toolkit Outlier assistant
• Built by Splunk Architect with no Data Science background


Next Steps!


Next Steps with Splunk ML

• Reach out to your Tech Team! We can help architect ML workflows.

• Lots of ML commands in Core Splunk (predict, anomalydetection, stats)

• ML Toolkit & Showcase – available and free, ready to use

• Splunk ITSI: Applied ML for ITOA use cases
  – Manage 1000s of KPIs & alerts
  – Adaptive Thresholding & Anomaly Detection

• Splunk UBA: Applied ML for Security
  – Unsupervised learning of Users & Entities
  – Surfaces Anomalies & Threats

• ML Customer Advisory Program:
  – Connect with Product & Engineering teams - mlprogram@splunk.com


What Else?

• Get the Machine Learning Toolkit from Splunkbase

• Go watch Machine Learning Videos on Splunk YouTube Channel: http://tiny.cc/splunkmlvideos

• Go watch the Machine Learning talks from Conf 2016:
  – Advanced Machine Learning in SPL with the Machine Learning Toolkit by Jacob Leverich
  – Extending SPL with Custom Search Commands and the Splunk SDK for Python by Jacob Leverich

• Early Adopter And Customer Advisory Program: mlprogram@splunk.com

• Field ML Architects: Andrew Stein (astein@), Brian Nash (bnash@)


Thank You


What’s New since our 0.9 Beta Release (last year’s .conf)?

• New name and abbreviation ;-)
• No event limits (removal of 50K limit on fitting models)
• Configurable resource caps via mlspl.conf
• Search head clustering support
• Distributed / streaming apply
• Scheduled fit
• New algorithms (next slide)
  – Feature engineering and selection
  – Stochastic gradient descent (e.g.)
  – ARIMA
• Multi-algorithm support across Assistants
• Scatterplot matrix viz
• Alerting
• Tooltips
• In-app tours
• Cluster Numeric Events assistant
• Videos videos videos for each assistant across IT, Security, IoT and Business Analytics
• ML-SPL Cheat Sheet