SaltConf14 - Brendan Burns, Google - Management at Google Scale

Post on 11-May-2015

description

As a leading developer of large-scale Web services, Google was forced early on to develop systems to support the deployment and management of diverse workloads at immense scale. As the broader developer community embraces cloud technologies, we see significant parallels between the internal management infrastructure that Google has built over the last decade and the open source management technologies of today. This talk will describe Google's experience in managing large-scale compute services, draw parallels to open source efforts underway today, and sketch out how our past experience shapes our future development of the Google Cloud Platform.

transcript

Google confidential │ Do not distribute

Management at Google Scale

Converging managed infrastructure between Google and the Cloud community

Brendan Burns, Staff Software Engineer

Google Cloud Platform

• Storage: Cloud Storage, Cloud SQL, Cloud Datastore
• Compute: Compute Engine, App Engine
• App Services: BigQuery, Cloud Endpoints

For the past 15 years, Google has been building the fastest, most powerful, highest-quality cloud infrastructure on the planet.

Images by Connie Zhou

We’ve had some practice

So, what have we learned?

• Declarative Management for sanity
• Containers for idempotency and reproducibility
• Task Introspection (or how I learned to forget about SSH)

A view into my life

• Google engineer for 6 years

• Search Infrastructure (Realtime Search, Google+ Search …)

• Cloud Infrastructure

• Build software to expect failure

• Never had root@gooogle.com, despite web search oncall for 4+ years

Declarative Management

• Imperative management leads to “Snowflake” servers
• Separate the textual declaration from its physical (virtual) manifestation
• Reasoning in a formal declaration (and version control) unlocks tremendous potential
• Declarative configurations facilitate re-use

Containers

• Google has a long history with containers (Process CGroups, LMCTFY [https://github.com/google/lmctfy])
• Of late, there has been a great deal of external interest as well.
• What are containers good for?

Introspection (or how I learned to forget about SSH)

• Containers don’t really have SSH (well, they can, but…)
• Still want containers to be self-contained
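
The slides do not say how introspection is implemented. One common approach, assumed here purely for illustration, is for the task inside the container to serve its own status over HTTP so that nobody needs a shell to inspect it. The `/statusz` path, the field names, and the port below are hypothetical choices, and the sketch uses only the Python standard library.

```python
import json
import os
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def status_snapshot(requests_served: int) -> dict:
    """Collect the facts an operator would otherwise SSH in to look up."""
    return {
        "pid": os.getpid(),
        "uptime_seconds": round(time.time() - START_TIME, 3),
        "requests_served": requests_served,
    }


class StatusHandler(BaseHTTPRequestHandler):
    """Serves the snapshot as JSON on the (hypothetical) /statusz path."""

    requests_served = 0

    def do_GET(self):
        StatusHandler.requests_served += 1
        if self.path != "/statusz":
            self.send_error(404)
            return
        body = json.dumps(status_snapshot(StatusHandler.requests_served)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Block forever, answering status requests on the given port."""
    HTTPServer(("", port), StatusHandler).serve_forever()
```

Calling `serve()` from the task's entry point makes the status available to any HTTP client, so the container stays self-contained and needs no SSH daemon.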

There’s an exciting road ahead...

Resources/Contact

• bburns@gooogle.com
• Eric Johnson’s talk
• I’ll be at the Google booth
• Walk up and say “Hi”

Thomas Hatch, SaltStack CTO

The Top Six Things You Didn’t Know About SaltStack

1. Fast, flexible comms protocol

• SaltStack provides options
• Different solutions for different problems
• Flexibility and plug-ability
• ØMQ
  – Super fast
• SSH
  – For certain use cases
  – 50x faster than other SSH-based tools
• RAET
  – UDP or TCP
  – Even faster
  – More control over job queuing and prioritization
  – More infrastructure visibility

2. Salt Virt

• Doesn’t get much attention
• Salt originally designed as a cloud controller (Butter)
• A completely different approach to cloud management
  – Database free
  – Evolving but being used in production

3. Declarative or imperative? Yes.

• Stick a fork in this debate
• Most flexible configuration management
• Finite order execution is a core Salt design principle
• 0.17 introduced more state ordering choice
• Compiler and run time
  – Salt modularity
  – No sacrifice or compromise of speed
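
The combination of declared states and a finite execution order can be sketched as follows. This is not Salt's compiler and uses no Salt APIs: the `execution_order` function and the state IDs are hypothetical. The assumption is that states run in the order they are declared unless a state lists requirements, in which case those run first.

```python
from typing import Dict, List


def execution_order(states: Dict[str, List[str]]) -> List[str]:
    """Map of state ID -> required state IDs, in declaration order.

    Returns a deterministic run order: declaration order, except that
    every requirement is placed before the state that needs it.
    """
    ordered: List[str] = []
    visiting = set()

    def visit(state_id: str) -> None:
        if state_id in ordered:
            return
        if state_id in visiting:
            raise ValueError(f"circular requirement at {state_id}")
        visiting.add(state_id)
        for dep in states[state_id]:
            visit(dep)
        visiting.discard(state_id)
        ordered.append(state_id)

    for state_id in states:
        visit(state_id)
    return ordered


# Hypothetical declaration: the service needs its package and config first.
states = {
    "nginx_service": ["nginx_pkg", "nginx_conf"],
    "nginx_pkg": [],
    "nginx_conf": ["nginx_pkg"],
    "motd": [],
}

print(execution_order(states))
```

The same declaration always produces the same order, so the author can reason declaratively about dependencies while still relying on a predictable, imperative-style sequence at run time.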

4. Generic device automation

• Minion proxy for network devices (Juniper, Arista, Broadcom, F5, etc.)
• Not just executing CM routines
• Finite device control w/ remote execution
• Easy to communicate with and control these typically dumb devices
• Stateful configuration and one-off queries
• Integrated with standard Salt workflows and methodologies

5. The Salt test suite

• More stable Salt releases
• Pedro Algarvio!
• Running live tests constantly on real infra
  – Jenkins
  – Spinning up VMs on Rackspace to run tests
  – Hooked into Docker containers
• PyLint coverage (thx Hulu & LogiLab)
• Test coverage doubled in three months

6. The SaltStack name

• Not SLC
• FLOSS Weekly realization
• Gimli, son of Gloin
• Ubiquitous nature of Salt

Thank You