A Microservice Architecture for Big Data Pipelines

Post on 14-Apr-2017

BigData.be Meetup, June 2016

Let’s face it: Big Data is no longer a Big Deal

Image © User:Kleiner / Wikimedia Commons / CC BY-SA 3.0

www.realimpactanalytics.com

Yardsticks of Software Development:

1. Create Modularity

2. Ensure Quality

3. Scale Development

4. Painless Deployment

1. Challenge: Big Data in Production

2. Zen of Micro Services

3. Data Modules

4. Conclusion

Image © User:Guma89 / Wikimedia Commons / CC BY-SA 3.0

Modularity is Imperative for On Premise Deployments:

[Diagram: RealImpact; three Products; three Clients]

The Promised Land

Image http://hanciong.deviantart.com/art/old-world-map-253195357

Micro Services: Maximal Modularity

1. No shared state

2. Minimal coupling

3. Separation of concerns

4. Mix & match

Micro Services: Scalable Development

1. Team responsibility

2. Less code = faster ramp up

3. Technology independence

Micro Services: Painless Deployment

1. Reproducible environments

2. Versioned APIs

3. Installation = docker-compose up

[Diagram: Prod and Dev environments]

Micro Services: QA Friendly

1. Three levels of testing

• Class / function level

• Service level

• Integration level

2. Staging is no big deal

Translation to Big Data Pipelines…

[Diagram: a pipeline with Twitter Data and the modules TrendingAnalysis, TopTweeters, and Recommend]
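The pipeline on this slide can be thought of as a dependency graph between modules. The following is a minimal sketch of how such wiring could be derived, assuming each module declares what it reads and writes. Only `twitter` and `daily-trends` appear in the slides; the other dataset ids, and the assumption that Recommend consumes the output of both other modules, are invented for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical declarations: what each module reads and writes.
# Dataset ids other than "twitter" and "daily-trends" are made up here.
MANIFESTS = {
    "TrendingAnalysis": {"datasources": ["twitter"], "outputs": ["daily-trends"]},
    "TopTweeters": {"datasources": ["twitter"], "outputs": ["top-tweeters"]},
    "Recommend": {
        "datasources": ["daily-trends", "top-tweeters"],
        "outputs": ["recommendations"],
    },
}


def execution_order(manifests):
    """Return module names ordered so that producers run before consumers."""
    # Map every output dataset to the module that produces it.
    producer = {
        out: name for name, m in manifests.items() for out in m["outputs"]
    }
    # For each module, collect the modules it depends on. External sources
    # such as "twitter" have no producer and are skipped.
    graph = {
        name: {producer[src] for src in m["datasources"] if src in producer}
        for name, m in manifests.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = execution_order(MANIFESTS)
print(order)
```

Because the ordering is computed from the declarations alone, a module can be swapped or added without touching the others, which is the "mix & match" property the deck asks for.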

Translation to Big Data Pipelines…

[Diagram: the TrendingAnalysis module packaged as a container holding manifest.yaml, run.sh, a jar, and a runtime]

Translation to Big Data Pipelines…

TrendingAnalysis

manifest.yaml:

datasources:
  - twitter

outputs:
  - id: daily-trends
    fields:
      - name: keyword
        type: string
      - name: relevance
        type: integer

parameters: …
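A declared output schema like the one in this manifest makes it possible to check a module's result mechanically. Below is a rough sketch of such a check, not the speaker's actual tooling. The manifest is written as a Python dict to keep the example dependency-free, the `parameters` section is left out because the slide elides it, and `validate_rows` and `PYTHON_TYPES` are hypothetical names.

```python
# The manifest from the slide, transcribed as a dict (parameters omitted
# because the slide does not show them).
MANIFEST = {
    "datasources": ["twitter"],
    "outputs": [
        {
            "id": "daily-trends",
            "fields": [
                {"name": "keyword", "type": "string"},
                {"name": "relevance", "type": "integer"},
            ],
        }
    ],
}

# Assumed mapping from manifest type names to Python types.
PYTHON_TYPES = {"string": str, "integer": int}


def validate_rows(manifest, output_id, rows):
    """Raise ValueError if a row does not match the declared fields.

    Returns the number of rows checked.
    """
    output = next(o for o in manifest["outputs"] if o["id"] == output_id)
    expected = {f["name"]: PYTHON_TYPES[f["type"]] for f in output["fields"]}
    for i, row in enumerate(rows):
        if set(row) != set(expected):
            raise ValueError(
                f"row {i}: expected fields {sorted(expected)}, got {sorted(row)}"
            )
        for name, typ in expected.items():
            value = row[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, typ):
                raise ValueError(
                    f"row {i}: field {name!r} should be {typ.__name__}"
                )
    return len(rows)


checked = validate_rows(
    MANIFEST, "daily-trends", [{"keyword": "spark", "relevance": 3}]
)
print(checked)
```

A check of this kind could run after each module in a staging environment, so that a schema mismatch surfaces before a downstream module consumes the data.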

Translation to Big Data Pipelines…

[Diagram: the TrendingAnalysis module with HDFS, Input Data, Parameters, and Result]
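Seen from the inside, a data module on this slide takes input data and parameters and produces a result. A toy sketch of that contract follows. The hashtag counting stands in for whatever the real TrendingAnalysis jar computes, the `top_n` parameter is a hypothetical name, and reading from and writing to HDFS is left to the surrounding wrapper.

```python
from collections import Counter


def run_trending_analysis(tweets, parameters):
    """Toy stand-in for a data module: input data + parameters -> result.

    tweets:     iterable of tweet texts (the input data)
    parameters: dict of settings; "top_n" is an assumed parameter name
    Returns rows with the "keyword" and "relevance" fields.
    """
    top_n = parameters.get("top_n", 10)
    counts = Counter(
        word.lower()
        for text in tweets
        for word in text.split()
        if word.startswith("#")
    )
    return [
        {"keyword": keyword, "relevance": count}
        for keyword, count in counts.most_common(top_n)
    ]


result = run_trending_analysis(
    ["#spark is fast", "I like #Spark and #hdfs"],
    {"top_n": 1},
)
print(result)
```

Keeping the computation a pure function of its inputs and parameters means the same code can run against a small local fixture or against the full dataset.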

Demo

Data Modules: QA Friendly?

1. Three levels of testing ✔

• Class / function level

• Module level

• Integration level

2. Staging is no big deal ✔
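The three testing levels can be illustrated with a small sketch. Everything here is hypothetical: `normalize_user`, `top_tweeters`, and `recommend_follow` are toy stand-ins for a helper function, a data module, and a downstream module, and the field names are invented.

```python
from collections import Counter


def normalize_user(name):
    """Helper under test at the function level."""
    return name.lstrip("@").lower()


def top_tweeters(tweets, n):
    """Toy data module: count tweets per user, return the n most active."""
    counts = Counter(normalize_user(t["user"]) for t in tweets)
    return [{"user": u, "tweets": c} for u, c in counts.most_common(n)]


def recommend_follow(top_rows):
    """Toy downstream module that consumes the output of top_tweeters."""
    return [f"follow @{row['user']}" for row in top_rows]


FIXTURE = [
    {"user": "@Alice", "text": "hello"},
    {"user": "alice", "text": "again"},
    {"user": "@bob", "text": "hi"},
]


def test_function_level():
    # Smallest unit: a single function in isolation.
    assert normalize_user("@Alice") == "alice"


def test_module_level():
    # One module, run end to end on a small fixture dataset.
    rows = top_tweeters(FIXTURE, n=1)
    assert rows == [{"user": "alice", "tweets": 2}]


def test_integration_level():
    # Two modules chained: the output of one is the input of the next.
    assert recommend_follow(top_tweeters(FIXTURE, n=1)) == ["follow @alice"]


for test in (test_function_level, test_module_level, test_integration_level):
    test()
```

The module-level test is the one that corresponds to the service-level test in the microservice setting: it exercises a module through its declared inputs and outputs only.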

Data Modules: Painless Deployment?

1. Reproducible environments (✔)

2. Versioned APIs ✔

3. Installation = docker-compose up (✔)

Data Modules: Scalable Development?

1. Team responsibility ✔

2. Less code = faster ramp up ✔

3. Technology independence ✔

Data Modules: Modularity?

1. No shared state (well…)

2. Minimal coupling ✔

3. Separation of concerns ✔

4. Mix & match ✔

Conclusion

Brussels Office 5, Place du Champ de Mars 1050 Brussels Belgium

Cape Town Office Sovereign Quay, 34 Somerset Road, 8005 Green Point, Cape Town South Africa

São Paulo Office 93, Rua Doutor Andrade Pertence Vila Olímpia, São Paulo Brazil

Luxembourg Office 691, rue de Neudorf 2220 Luxembourg Grand Duché du Luxembourg

Kuala Lumpur Office 28-01, Integra Tower 348 Jalan Tun Razak, 50400 Kuala Lumpur Malaysia