Opportunities & Challenges in Adopting Microservice Architecture ...

Post on 13-Feb-2017

Opportunities & Challenges in Adopting Microservice Architecture for Enterprise Workloads

Shriram Rajagopalan, Priya Nagpurkar, Tamar Eilam, and Hani Jamjoom (IBM Watson Research)

Etai Lev-Ran and Vita Bortnikov (IBM Research, Haifa)

Frank Budinsky (IBM)

Contact: shriram@us.ibm.com

• Emergence of microservices & DevOps
• Challenges & Opportunities
• Adopting a SDN perspective of microservices
• Version/Content-aware routing
• Systematic resilience testing

From Monoliths to Microservices

Monolithic service instances
• A single service serves multiple purposes
• Tight coupling across services

Microservice instances
• Each service serves a single purpose (functionality)
• Many loosely-coupled microservices communicate over the network
• Well-defined API
From Waterfall to DevOps

Waterfall: one Plan, Develop, Test, Deploy cycle that takes months to years
• Features, performance improvements, bug fixes, etc., are periodically delivered as one big update

DevOps: repeated Plan, Develop, Test, Deploy cycles that each take hours to days
• Continuous delivery of incremental updates
• Culture + Automation + Instrumentation
• Emphasizes constant experimentation & feedback-driven development

Microservices + DevOps

• Polyglot applications with loosely-coupled microservices
• Small “two pizza” teams per microservice
  – Autonomy & accountability
  – Own the roadmap for the feature/service
  – Independent launch schedules
• Develop, deploy, scale
  – “You build it, you run it”
• 10s to 100s of deployments a day across the application
  – E.g., Orbitz, GrubHub, HubSpot
• Multiple versions co-exist simultaneously

[Figure: users reach an application built from microservices A, B, C, D, D’, and F, written in Ruby, Node.js, Go, and Java. Other labels: cloud platform services (RDBMS, message bus, NoSQL) and 3rd-party Internet services (social media, mobile push notification).]

“Traditional” enterprises are moving or have moved to Microservices + DevOps

Opportunities

• Enterprises are
  – Re-architecting legacy applications to microservice architecture
  – Developing in-house platforms to host sensitive apps on premise
    • E.g., Fidelity’s Mako
  – Still experimenting with different design alternatives
  – Heavily leveraging open-source technologies
• Opportunity for the research community to engage
  – Influence infrastructure & application design
  – Integrate ideas into open-source platforms and solutions

Challenges

• 10s to 100s of deployments a day across the application
• Multiple versions co-exist simultaneously
• Complexity shifted to the network and orchestration across services
• Cascading failures despite the microservices being designed for failure

[Figure: users reach an application built from microservices A, B, C, D, D’, and F, written in Ruby, Node.js, and Go. Other labels: cloud platform services (RDBMS, message bus, NoSQL) and 3rd-party Internet services (Facebook, mobile push notification).]

Ad-hoc Designs & Implementations

• Two options:
  – Adopt open-source frameworks from large-scale Internet applications (e.g., Netflix OSS)
    • These frameworks are point solutions that fit the needs & environment of the companies that operate these applications (e.g., Java-only support)
  – Shoehorn the service-oriented web application into clustering frameworks like Kubernetes, Marathon, etc., and write ad-hoc tools on top to control the microservices

Microservice Application Requirements

• Integration
  – Service registration & discovery
  – Load balancing of requests across microservice instances
• Version & content-aware routing
  – Hypothesis-driven development (i.e., A/B testing)
  – Canary deployments (feature release to % of users)
  – Red/Black deployments (gradual rollout to all users)
  – Etc.
• Operational testing in production
  – E.g., does failure recovery work as expected?

Introducing Amalgam8

• Observation:
  – Microservices interact only over the network, predominantly using HTTP(S)
  – Existing solutions lack the ability to dynamically control the routing of requests between two microservices
• Insight:
  – Think of requests as packets and microservices as switches
  – A Layer-7 SDN will simplify integration and routing
• Design:
  – Sidecar: A programmable layer-7 proxy process attached to each microservice
  – Controller: The equivalent of an SDN controller, except at Layer 7
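The sidecar and registry roles described above can be illustrated with a minimal Python sketch. This is not Amalgam8's implementation or API: the `ServiceRegistry` and `Sidecar` classes, their method names, and the addresses are all hypothetical, and round-robin is just one possible balancing policy.

```python
import itertools
from collections import defaultdict


class ServiceRegistry:
    """Hypothetical in-memory registry: service name -> instance addresses."""

    def __init__(self):
        self._instances = defaultdict(list)

    def register(self, service, address):
        if address not in self._instances[service]:
            self._instances[service].append(address)

    def deregister(self, service, address):
        if address in self._instances[service]:
            self._instances[service].remove(address)

    def lookup(self, service):
        return list(self._instances[service])


class Sidecar:
    """Hypothetical layer-7 proxy attached to one microservice.

    It resolves the destination service through the registry and spreads
    outbound requests over the registered instances in round-robin order.
    """

    def __init__(self, registry):
        self._registry = registry
        # One independent request counter per destination service.
        self._counters = defaultdict(itertools.count)

    def pick_instance(self, service):
        instances = self._registry.lookup(service)
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        index = next(self._counters[service]) % len(instances)
        return instances[index]


# Two instances of service B register; A's sidecar alternates between them.
registry = ServiceRegistry()
registry.register("B", "10.0.0.1:8080")
registry.register("B", "10.0.0.2:8080")
sidecar_a = Sidecar(registry)
picks = [sidecar_a.pick_instance("B") for _ in range(4)]
```

Because the microservice only ever talks to its local sidecar, registration, discovery, and load balancing stay out of application code, which is what makes the approach language-agnostic.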

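Simplifying Integration

[Figure: requests enter a data plane with tenant apps, where microservices A, A', B, B’, and C each run with a sidecar on Kubernetes, Marathon, Swarm, VMs, or bare metal. A multi-tenant control plane (controller and service registry, behind an API) serves Tenant 1, Tenant 2, and Tenant 3.]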

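[Figure: requests enter a data plane with tenant apps, where microservices A, A', B, B’, and C each run with a sidecar on Kubernetes, Marathon, Swarm, VMs, or bare metal. A multi-tenant control plane (controller, registry, and version routing, behind an API) serves Tenant 1, Tenant 2, and Tenant 3. Other labels: Analytics; Active Deploy (canary deployments, Red/Black deploy…).]

• Example routing rules: upgrade from B to B’; send 35% of iPhone traffic to A’ and 65% to A
• Auto-rollback if B’ fares poorly compared to B, with a given confidence measure (ref. Canary Advisor, ISSTA 2015)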
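The example rule on this slide (send 35% of iPhone traffic to A' and 65% to A) can be sketched as a content-aware match followed by a weighted choice. The sketch below is illustrative only: `RouteRule`, `choose_version`, `is_iphone`, and the User-Agent strings are hypothetical names and values, not the Amalgam8 rule format.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class RouteRule:
    """Hypothetical rule: if `match` accepts the request headers, split
    traffic across versions according to `weights` (fractions summing to 1)."""
    match: Callable[[Dict[str, str]], bool]
    weights: List[Tuple[str, float]]


def choose_version(headers, rules, default, rng):
    """Return the version that should serve a request with these headers."""
    for rule in rules:
        if rule.match(headers):
            versions = [version for version, _ in rule.weights]
            fractions = [fraction for _, fraction in rule.weights]
            return rng.choices(versions, weights=fractions, k=1)[0]
    return default  # no rule matched: keep serving the default version


def is_iphone(headers):
    # Content-aware part: inspect a request header (assumed: User-Agent).
    return "iphone" in headers.get("User-Agent", "").lower()


rules = [RouteRule(match=is_iphone, weights=[("A'", 0.35), ("A", 0.65)])]
rng = random.Random(0)  # seeded only so the demo is repeatable

iphone = {"User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 10_0 like Mac OS X)"}
desktop = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}

sample = [choose_version(iphone, rules, "A", rng) for _ in range(10_000)]
canary_share = sample.count("A'") / len(sample)  # close to 0.35
desktop_version = choose_version(desktop, rules, "A", rng)  # always "A"
```

Since the rule is plain data evaluated per request, a controller could change the weights at runtime (for example, back to 0% for A' on rollback) without redeploying either version.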

Resilience Testing

• Microservices designed but “seldom” tested for failures
• Randomized fault injection (e.g., Netflix Chaos Monkey) is insufficient
  – Manual effort to validate whether application recovered properly or not
• Gremlin: systematic resilience testing
  – Script failure scenarios and expectations
  – Faults injected from the network
  – Run assertions on the logs to validate expectations
  – Exposes faulty recovery behavior, conflicting failure handling policies across services, etc.
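To make "script expectations, then run assertions on the logs" concrete, here is a minimal Python sketch. It does not use Gremlin's actual API: the `LogRecord` fields, the two check functions, the thresholds, and the log entries are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    """Hypothetical request-log entry, as a sidecar might record it."""
    source: str
    destination: str
    status: int
    latency_ms: float


def requests_between(logs, source, destination):
    return [r for r in logs
            if r.source == source and r.destination == destination]


def bounded_retries(logs, source, destination, max_attempts):
    """Expectation: `source` stops calling a failing `destination`
    after at most `max_attempts` requests."""
    return len(requests_between(logs, source, destination)) <= max_attempts


def replies_within(logs, source, destination, limit_ms):
    """Expectation: every logged reply from `destination` to `source`
    arrived within `limit_ms` (and at least one reply was logged)."""
    replies = requests_between(logs, source, destination)
    return bool(replies) and all(r.latency_ms <= limit_ms for r in replies)


# Illustrative logs from a run where C was made to fail (invented data).
healthy_recovery = [
    LogRecord("user", "A'", 200, 8.0),
    LogRecord("A'", "C", 503, 1.0),
    LogRecord("A'", "C", 503, 1.0),
]
faulty_recovery = [
    LogRecord("user", "A'", 200, 950.0),
] + [LogRecord("A'", "C", 503, 1.0) for _ in range(7)]

healthy_ok = (bounded_retries(healthy_recovery, "A'", "C", 3)
              and replies_within(healthy_recovery, "user", "A'", 10))
faulty_ok = (bounded_retries(faulty_recovery, "A'", "C", 3)
             and replies_within(faulty_recovery, "user", "A'", 10))
```

Expressing the expectation as code replaces the manual validation step noted above: the same checks run after every injected failure and flag the second run, where A' retries without bound and replies late.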

[Figure: requests enter a data plane with tenant apps, where microservices A, A', B, B’, and C each run with a sidecar on Kubernetes, Marathon, Swarm, VMs, or bare metal. A multi-tenant control plane (controller, registry, version routing, and fault injection, behind an API) serves Tenant 1, Tenant 2, and Tenant 3. Other labels: Gremlin Resilience Testing; example test: Overload(C), Assert(A’ responds in 10 ms).]

• Failures are emulated by manipulating network interactions between services (e.g., delays, HTTP 500s, etc.)
• Assertions are validated against request logs to identify faulty recovery behavior
• Ref. Gremlin, ICDCS 2016
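Emulating failures from the network means the proxy, not the service, misbehaves on purpose. A minimal Python sketch of that idea follows, assuming a hypothetical `FaultRule` record and `forward` function (these are not Gremlin or Amalgam8 identifiers). The `sleep` parameter is injectable so the example runs without real delays.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class FaultRule:
    """Hypothetical fault to apply to requests from `source` to `destination`."""
    source: str
    destination: str
    delay_s: float = 0.0                 # emulate a slow or overloaded service
    abort_status: Optional[int] = None   # emulate a failure, e.g. HTTP 500


def forward(source, destination, upstream, rules, sleep=time.sleep):
    """Proxy one request, applying the first matching fault rule.

    `upstream` is a zero-argument callable that performs the real call and
    returns (status, body). Returns the (status, body) seen by the caller.
    """
    for rule in rules:
        if rule.source == source and rule.destination == destination:
            if rule.delay_s > 0:
                sleep(rule.delay_s)
            if rule.abort_status is not None:
                # The real service is never contacted.
                return rule.abort_status, "fault injected"
            break
    return upstream()


def real_service():
    return 200, "ok"


rules = [
    FaultRule("A", "C", abort_status=500),
    FaultRule("A", "B", delay_s=0.05),
]
delays = []  # records requested delays instead of actually sleeping

aborted = forward("A", "C", real_service, rules, sleep=delays.append)
delayed = forward("A", "B", real_service, rules, sleep=delays.append)
untouched = forward("B", "C", real_service, rules, sleep=delays.append)
```

Because the rules match on source and destination, a fault can be scoped to a single caller while other traffic to the same service is left alone.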

Thank You

• https://amalgam8.io
• https://github.com/amalgam8/examples

Backup

Research Challenges in the Face of Continuous Change

• Managing stateful services and data stores
• Problem determination gains many dimensions
  – The problem may not just be in your code
  – Many dimensions change simultaneously, such as infrastructure, runtime, etc.
  – Can we pinpoint the issue down to the Git commit by correlating runtime logs and development history?
• Too much data, too little insight
  – Logs are emitted by all layers of the software stack, by automated build tools, etc.
  – Yet, we are nowhere close to pinpointing the problem and fixing it when things go wrong!

Opportunities to Fix Issues Before They Occur

• Software build, test, and deployment phases are completely automated
• Provides a unique opportunity to catch security vulnerabilities, buggy implementations, etc., even before software is deployed
• However, existing tools and techniques do not scale to the extreme code churn (100s of deployments)