Embracing the Monolith

Posted on 23-Feb-2017

Doubling down on Python to move fast without breaking things.

Embracing the Monolith in Small Teams

Leon Sasson @leonsasson

PyData Chicago 2016

Rise Science

Product Goals

• Sleep Improvement • User Enjoyment

Iterate Fast

Young company, timeline of weeks and days.

Data is core to the product

No data = 😩

Development Cycle

Hypothesis

Exploration

Experiment

Productizing

Evaluate & Analyze

Easy, right?

😓

Obstacles

• Data Silos • Moving between phases requires different tools • People • "It works on staging"

• Testing data products is hard • Garbage in → Garbage out • Capacity problems

Extended Product Cycles

How do we start?

Descriptive, visuals, basic summaries

Step Back

What the organization needs.

Understand problem before getting into solutions

Solution First

Focus is on tech trade-off

vs.

Problem First

Focus is on making progress for the org

Business Optimality

Technical Optimality

What's the least I can do to solve the problem?

You need an architecture compatible with this mindset

Monolithic Architecture

© Martin Fowler: http://martinfowler.com/articles/microservices.html

A monolithic application puts all its functionality into a single process...

... and scales by replicating the monolith on multiple servers

A microservices architecture puts each element of functionality into a separate service...

... and scales by distributing these services across servers, replicating as needed

Django. The Good Things

Reuse Libraries

IPython Notebooks

Reuse your ORM when accessing data.

Pandas, django-pandas
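
One way to picture the reuse described above: in a monolith, the same function can back a product feature and an ad hoc notebook analysis, because both import from one codebase. The sketch below is a minimal, framework-free illustration in plain Python. The function names, the record layout, and the sample values are assumptions made for the example; in a Django project the records would presumably come from the ORM rather than a hard-coded list.

```python
from statistics import mean


def average_duration_by_person(records):
    """Shared library code: group durations by person and average them."""
    by_person = {}
    for person_id, _date, duration in records:
        by_person.setdefault(person_id, []).append(duration)
    return {pid: mean(values) for pid, values in by_person.items()}


def summary_view(records, person_id):
    """Hypothetical stand-in for a product endpoint returning a JSON-ready dict."""
    averages = average_duration_by_person(records)
    return {"person_id": person_id, "average_duration": averages.get(person_id)}


# Sample records (person id, date, duration), assumed for illustration.
records = [
    (1, "2016-08-01", 450),
    (2, "2016-08-01", 426),
    (1, "2016-08-02", 438),
]

# Notebook-style exploration calls the very same function the product uses.
exploration = average_duration_by_person(records)
product_response = summary_view(records, person_id=1)
```

Because both call sites share one implementation, a fix or a new metric definition reaches the notebook and the product at the same time.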

Instrumentation

People

The Problem of Toil

"... manual, repetitive, automatable, tactical, devoid of enduring value, and scales linearly as a service grows ..."

Toil-induced negative data culture

Self-Serve Analytics

Breaking Data Silos

Why do Data Silos Happen?

person id   date         duration
1           2016-08-01   450
2           2016-08-01   426
1           2016-08-02   438

Row

person id   1            2            1
date        2016-08-01   2016-08-01   2016-08-02
duration    450          426          438

Columnar
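
The difference between the two layouts can be sketched in a few lines of plain Python, using the three sample records from the slides. This is only an in-memory illustration of the idea, not how any particular database stores data; the key name `person_id` and the helper names are assumptions for the example.

```python
# Row-oriented layout: the fields of each record are stored together.
rows = [
    (1, "2016-08-01", 450),
    (2, "2016-08-01", 426),
    (1, "2016-08-02", 438),
]

# Columnar layout: the values of each column are stored together.
columns = {
    "person_id": [1, 2, 1],
    "date": ["2016-08-01", "2016-08-01", "2016-08-02"],
    "duration": [450, 426, 438],
}


def mean_duration_rows(rows):
    # Walks every record and picks the third field out of each one.
    return sum(record[2] for record in rows) / len(rows)


def mean_duration_columns(columns):
    # Touches only the single column the aggregation needs.
    durations = columns["duration"]
    return sum(durations) / len(durations)


def to_columns(rows, names):
    # Pivot a list of row tuples into a dict of column lists.
    return {name: [record[i] for record in rows] for i, name in enumerate(names)}
```

Both functions return the same average; the columnar version simply never reads the `person_id` or `date` values, which is the intuition behind columnar stores being well suited to aggregations.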

Centralizing Data

Segment.com

Backend DB

Redshift

ETL

Redshift is fast for aggregations

Out-of-the-box compatible with Postgres

(Mostly..)
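
As a rough illustration of the kind of aggregation meant here, the sketch below runs a plain `GROUP BY` query over the sample records from the slides. It uses Python's built-in `sqlite3` with an in-memory database purely so the example is self-contained; the table name `sleep` and its column names are assumptions, and whether the same statement runs unchanged on Redshift is likewise an assumption, in the spirit of the "mostly" caveat above.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sleep (person_id INTEGER, date TEXT, duration INTEGER)"
)
conn.executemany(
    "INSERT INTO sleep VALUES (?, ?, ?)",
    [
        (1, "2016-08-01", 450),
        (2, "2016-08-01", 426),
        (1, "2016-08-02", 438),
    ],
)

# A typical self-serve question: average duration and night count per person.
AVG_DURATION_SQL = """
SELECT person_id, AVG(duration) AS avg_duration, COUNT(*) AS nights
FROM sleep
GROUP BY person_id
ORDER BY person_id
"""

result = conn.execute(AVG_DURATION_SQL).fetchall()
conn.close()
```

Numeric behavior can differ between engines (for example, how an average over integers is typed), so results are worth checking on the actual warehouse.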

Bring data to the people

Positive Feedback Loop on Data Culture

• Non-technical people can access data whenever they need it • Data team can focus on bigger problems and act as enablers

Be scrappy.

Thanks!