
Cluster Management with Kubernetes

Please open the gears tab below for the speaker notes

Satnam Singh [email protected]

Work of the Google Kubernetes team and many open source contributors

University of Edinburgh, 5 June 2015

The promise of cloud computing

Cloud software deployment is soul destroying

Typically a cloud cluster node is a VM running a specific version of Linux.

User applications comprise components each of which may have different and conflicting requirements from libraries, runtimes and kernel features.

Applications are coupled to the version of the host operating system: bad.

Evolution of the application components is coupled to (and in tension with) the evolution of the host operating system: bad.

Also need to deal with node failures, spinning up and turning down replicas to deal with varying load, updating components without disruption …

You thought you were a programmer but you are now a sys-admin.

Docker

[Chart: Google Trends interest in “Docker” rising sharply. Source: Google Trends]

What is Docker?

An implementation of the container idea

A package format

Resource isolation

An ecosystem

Virtual machines for workloads?

We need to isolate the application components from the host environment.

VM vs. Docker

Docker

“build once, run anywhere”

Resource isolation

Implemented by a number of Linux APIs:

• cgroups: Restrict the resources a process can consume (see the pod-spec sketch after this list)
  • CPU, memory, disk IO, ...

• namespaces: Change a process’s view of the system
  • network interfaces, PIDs, users, mounts, ...

• capabilities: Limit what a user can do
  • mount, kill, chown, ...

• chroots: Determine what parts of the filesystem a user can see
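
Kubernetes surfaces the cgroup mechanism directly in its API: per-container resource limits in a pod spec are enforced by the kernel via cgroups. A minimal sketch, with illustrative names and values (this pod is not from the talk):

apiVersion: v1
kind: Pod
metadata:
  name: limited-pod            # hypothetical name, for illustration
spec:
  containers:
  - name: app
    image: ubuntu:14.04
    resources:
      limits:                  # enforced by the kernel via cgroups
        cpu: "500m"            # half a CPU core
        memory: "128Mi"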

We need more than just packing and isolation

Scheduling: Where should my containers run?

Lifecycle and health: Keep my containers running despite failures

Discovery: Where are my containers now?

Monitoring: What’s happening with my containers?

Auth{n,z}: Control who can do things to my containers

Aggregates: Compose sets of containers into jobs

Scaling: Making jobs bigger or smaller

...


Everything at Google runs in containers:
• Gmail, Web Search, Maps, ...
• MapReduce, MillWheel, Pregel, ...
• Colossus, BigTable, Spanner, ...
• Even Google’s cloud computing product GCE itself: VMs run in containers


Open Source Containers: Kubernetes

Greek for “helmsman”; also the root of the words “governor” and “cybernetic”

• Container orchestrator
• Builds on Docker containers
  • also supports other container technologies
• Multiple cloud and bare-metal environments
• Supports existing OSS apps
  • cannot require apps to become cloud-native
• Inspired and informed by Google’s experiences and internal systems
• 100% open source, written in Go

Let users manage applications, not machines

Primary concepts

Container: A sealed application package (Docker)

Pod: A small group of tightly coupled Containers

Labels: Identifying metadata attached to objects

Selector: A query against labels, producing a set result

Controller: A reconciliation loop that drives current state towards desired state

Service: A set of pods that work together

[Diagram: application containers run against the Kubernetes API, a unified compute substrate over a homogeneous machine fleet (virtual or physical)]

Kubernetes Architecture

[Diagram: master components (etcd, API Server, Scheduler, Controller Manager) and per-node components (Kubelet, Service Proxy), driven by clients such as kubectl and Ajax UIs]

Modularity

Loose coupling is a goal everywhere
• simpler
• composable
• extensible

Code-level plugins where possible

Multi-process where possible

Isolate risk with interchangeable parts

Examples: ReplicationController, Scheduler

Reconciliation between declared and actual state

Control loops

Drive current state -> desired state

Act independently

APIs - no shortcuts or back doors

Observed state is truth

Recurring pattern in the system

Example: ReplicationController

[Diagram: the control loop cycles observe → diff → act]

Atomic storage

Backing store for all master state

Hidden behind an abstract interface

Stateless means scalable

Watchable
• this is a fundamental primitive
• don’t poll, watch

Using CoreOS etcd

Pods: Grouping containers

[Diagram: containers Foo and Bar packaged as one pod, sharing namespaces (Net, IPC, ...)]

Pods: Networking

[Diagram: Foo and Bar share the pod’s network namespace, so they see the same network interfaces]

Pods: Volumes

[Diagram: Foo and Bar both mount a volume that is shared across the pod]

Pods: Labels

[Diagram: identifying labels attached to the pod as a whole]
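
Pulling the pod slides together: a hedged sketch of a two-container pod, with illustrative names, images, and commands. Foo and Bar share the pod’s network and IPC namespaces (so they can talk over localhost) and a common volume:

apiVersion: v1
kind: Pod
metadata:
  name: foo-bar                # illustrative name
  labels:
    app: example               # labels attach to the pod as a whole
spec:
  containers:
  - name: foo
    image: ubuntu:14.04
    command: [bash, -c, 'while true; do date > /data/now; sleep 1; done']
    volumeMounts:
    - name: shared
      mountPath: /data         # writes into the shared volume
  - name: bar
    image: ubuntu:14.04
    command: [bash, -c, 'while true; do cat /data/now; sleep 1; done']
    volumeMounts:
    - name: shared
      mountPath: /data         # reads what foo wrote
  volumes:
  - name: shared
    emptyDir: {}               # scratch volume shared across the pod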


Persistent Volumes

A higher-level abstraction - insulation from any one cloud environment

Admin provisions them, users claim them

Independent lifetime and fate

Can be handed-off between pods and lives until user is done with it

Dynamically “scheduled” and managed, like nodes and pods

[Diagram: a Pod points at a user-owned PVClaim via a ClaimRef; the claim binds to an admin-owned PersistentVolume, backed by GCE PD, AWS EBS, NFS, iSCSI, ...]
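
A hedged sketch of the admin/user split, with illustrative names and sizes: the admin provisions a PersistentVolume; a user claims storage with a PersistentVolumeClaim; a pod then mounts the claim rather than the volume, so the data’s lifetime is independent of any one pod.

# Admin-owned: the volume itself, here backed by a GCE persistent disk
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001                 # illustrative name
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  gcePersistentDisk:
    pdName: my-disk            # illustrative disk name
    fsType: ext4
---
# User-owned: a claim against the pool of provisioned volumes
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim               # illustrative name
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

A pod’s volumes section then refers to the claim by name (persistentVolumeClaim: claimName: my-claim), and the volume can later be handed off to another pod.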

Labels

Arbitrary metadata

Attached to any API object

Generally represent identity

Queryable by selectors
• think SQL ‘select ... where ...’

The only grouping mechanism

Use to determine which objects to apply an operation to

• pods under a ReplicationController
• pods in a Service
• capabilities of a node (scheduling constraints)
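
Concretely, labels are plain key/value pairs under an object’s metadata. A pod carrying the labels used in the grid below (the name and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: nifty-dev-fe           # illustrative name
  labels:                      # identifying metadata, queryable by selectors
    App: Nifty
    Phase: Dev
    Role: FE
spec:
  containers:
  - name: app
    image: ubuntu:14.04        # illustrative image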

Selectors

[Diagram series: four pods labeled
  {App: Nifty, Phase: Dev,  Role: FE}
  {App: Nifty, Phase: Dev,  Role: BE}
  {App: Nifty, Phase: Test, Role: FE}
  {App: Nifty, Phase: Test, Role: BE}
Each slide applies a selector and highlights the matching pods:
  App == Nifty                → all four pods
  App == Nifty, Role == FE    → the two frontend pods
  App == Nifty, Role == BE    → the two backend pods
  App == Nifty, Phase == Dev  → the two Dev pods
  App == Nifty, Phase == Test → the two Test pods]

Pod lifecycle

Once scheduled to a node, pods do not move
• restart policy means restart in-place

Pods can be observed as pending, running, succeeded, or failed
• failed is really the end - no more restarts
• no complex state machine logic

Pods are not rescheduled by the scheduler or apiserver
• even if a node dies
• controllers are responsible for this
• keeps the scheduler simple

Apps should consider these rules
• Services hide this
• makes pod-to-pod communication more formal
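
The in-place restart behaviour is set per pod through the restartPolicy field; a hedged fragment with an illustrative workload:

apiVersion: v1
kind: Pod
metadata:
  name: run-once               # illustrative name
spec:
  restartPolicy: OnFailure     # Always (the default) | OnFailure | Never
  containers:
  - name: task
    image: ubuntu:14.04
    command: [bash, -c, 'exit 0']  # succeeds once; never restarted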

Replication Controllers

[Diagram: a Replication Controller maintains #N copies of pods labeled production/backend]

Replication Controllers

A type of controller (control loop)

Ensure N copies of a pod are always running
• if too few, start new ones
• if too many, kill some
• group == selector

Cleanly layered on top of the core
• all access is via public APIs

Replicated pods are fungible
• no implied ordinality or identity

Other kinds of controllers coming
• e.g. a job controller for batch

Replication Controller
- Name = “nifty-rc”
- Selector = {“App”: “Nifty”}
- PodTemplate = { ... }
- NumReplicas = 4

[Diagram: the controller reconciles through the API Server: “How many?” → “3” → “Start 1 more” → “OK” → “How many?” → “4”]
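
Written out as a manifest, the controller from the diagram might look like this; the name, selector, and replica count come from the slide, while the pod template details are illustrative:

apiVersion: v1
kind: ReplicationController
metadata:
  name: nifty-rc
spec:
  replicas: 4                  # NumReplicas = 4
  selector:
    App: Nifty                 # Selector = {"App": "Nifty"}
  template:                    # PodTemplate = { ... }
    metadata:
      labels:
        App: Nifty             # template labels must match the selector
    spec:
      containers:
      - name: nifty            # illustrative container
        image: ubuntu:14.04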

Services

[Diagram: a service named “name” is given a stable virtual IP (1.2.3.4) and port(s), and load-balances across the production/backend pods]

Services

Service
- Name = “nifty-svc”
- Selector = {“App”: “Nifty”}
- Port = 9376
- ContainerPort = 8080

[Diagram: a client connects to the assigned portal IP 10.0.0.1:9376; kube-proxy, kept up to date by a watch on the apiserver, uses iptables DNAT to forward the TCP/UDP traffic to one of the pod endpoints 10.240.1.1:8080, 10.240.2.2:8080, or 10.240.3.3:8080]
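
The same service as a manifest; the name, selector, and ports come from the slide, and the rest is a sketch:

apiVersion: v1
kind: Service
metadata:
  name: nifty-svc
spec:
  selector:
    App: Nifty                 # traffic goes to pods matching this selector
  ports:
  - protocol: TCP
    port: 9376                 # the portal port clients connect to
    targetPort: 8080           # ContainerPort on the backend pods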

A Kubernetes cluster on Google Compute Engine

A fresh Kubernetes cluster

Node 0f64: logging

Node 02ej: logging, monitoring

Node pk22: logging, DNS

Node 27gf: logging

A counter pod

apiVersion: v1
kind: Pod
metadata:
  name: counter
  namespace: demo
spec:
  containers:
  - name: count
    image: ubuntu:14.04
    args: [bash, -c, 'for ((i = 0; ; i++)); do echo "$i: $(date)"; sleep 1; done']

A counter pod

$ kubectl create -f counter-pod.yaml --namespace=demo
pods/counter

$ kubectl get pods
NAME                                          READY  REASON   RESTARTS  AGE
fluentd-cloud-logging-kubernetes-minion-1xe3  1/1    Running  0         5m
fluentd-cloud-logging-kubernetes-minion-p6cu  1/1    Running  0         5m
fluentd-cloud-logging-kubernetes-minion-s2dl  1/1    Running  0         5m
fluentd-cloud-logging-kubernetes-minion-ypau  1/1    Running  0         5m
kube-dns-v3-55k7n                             3/3    Running  0         6m
monitoring-heapster-v1-55ix9                  0/1    Running  12        6m

Node 27gf: logging, counter

Observing the output of the counter

$ kubectl logs counter --namespace=demo
0: Tue Jun 2 21:37:31 UTC 2015
1: Tue Jun 2 21:37:32 UTC 2015
2: Tue Jun 2 21:37:33 UTC 2015
3: Tue Jun 2 21:37:34 UTC 2015
4: Tue Jun 2 21:37:35 UTC 2015
5: Tue Jun 2 21:37:36 UTC 2015
...

ssh onto the node and run “docker ps”

# docker ps
CONTAINER ID  IMAGE                                     COMMAND                 CREATED             STATUS             NAMES
532247036a78  ubuntu:14.04                              "bash -c 'i=0; whi...   About a minute ago  Up About a minute  k8s_count.dca54bea_counter_demo_479b8894-0971-11e5-a784-42010af00df1_f6159d40
8cd07658287d  gcr.io/google_containers/pause:0.8.0      "/pause"                About a minute ago  Up About a minute  k8s_POD.e4cc795_counter_demo_479b8894-0971-11e5-a784-42010af00df1_7de2fec0
b2dc87db6608  gcr.io/google_containers/fluentd-gcp:1.6  "/bin/sh -c '/usr/...   16 minutes ago      Up 16 minutes      k8s_fluentd-cloud-logging.463ca0af_fluentd-cloud-logging-kubernetes-minion-27gf_default_4ab77985c0cb4f28a020d3b097af9654_3e908886
c5d8641d884d  gcr.io/google_containers/pause:0.8.0      "/pause"                16 minutes ago      Up 16 minutes      k8s_POD.e4cc795_fluentd-cloud-logging-kubernetes-minion-27gf_default_4ab77985c0cb4f28a020d3b097af9654_2b980b91

Example: Music DB + UI

[Diagram: four music-db pods behind the music-db service at http://music-db:9200; a music-ui pod serves the UI at http://music-ui:5601]

Example: Elasticsearch + Kibana Music DB & UI

apiVersion: v1
kind: ReplicationController
metadata:
  labels:
    app: music-db
  name: music-db
spec:
  replicas: 4
  selector:
    app: music-db
  template:
    metadata:
      labels:
        app: music-db
    spec:
      containers:
      - name: es
        image: kubernetes/elasticsearch:1.0
        env:
        - name: "CLUSTER_NAME"
          value: "mytunes-db"
        - name: "SELECTOR"
          value: "name=music-db"
        - name: "NAMESPACE"
          value: "mytunes"
        ports:
        - name: es
          containerPort: 9200
        - name: es-transport
          containerPort: 9300

Music DB Replication Controller

apiVersion: v1
kind: ReplicationController
metadata:
  labels:
    app: music-db
  name: music-db
spec:
  replicas: 4
  selector:
    app: music-db
  template:
    metadata:
      labels:
        app: music-db
    spec:
      containers:
      ...

Music DB container

containers:
- name: es
  image: kubernetes/elasticsearch:1.0
  env:
  - name: "CLUSTER_NAME"
    value: "mytunes-db"
  - name: "SELECTOR"
    value: "name=music-db"
  - name: "NAMESPACE"
    value: "mytunes"
  ports:
  - name: es
    containerPort: 9200
  - name: es-transport
    containerPort: 9300

Music DB Service

apiVersion: v1
kind: Service
metadata:
  name: music-db
  labels:
    app: music-db
spec:
  selector:
    app: music-db
  ports:
  - name: db
    port: 9200
    targetPort: es

Music DB

[Diagram: the music-db service at http://music-db:9200 fronts the four music-db pods]

Music DB Query

Music UI Pod

apiVersion: v1
kind: Pod
metadata:
  name: music-ui
  labels:
    app: music-ui
spec:
  containers:
  - name: kibana
    image: kubernetes/kibana:1.0
    env:
    - name: "ELASTICSEARCH_URL"
      value: "http://music-db:9200"
    ports:
    - name: kibana
      containerPort: 5601

Music UI Service

apiVersion: v1
kind: Service
metadata:
  name: music-ui
  labels:
    app: music-ui
spec:
  selector:
    app: music-ui
  ports:
  - name: kibana
    port: 5601
    targetPort: kibana
  type: LoadBalancer

Music DB + UI

[Diagram: four music-db pods behind http://music-db:9200; the music-ui pod behind http://music-ui:5601, exposed externally via a load balancer at http://104.197.86.235:5601]

Music UI Query

Scale DB and UI independently

[Diagram: three music-db pods and two music-ui pods, each set resized independently]

Monitoring

Optional add-on to Kubernetes clusters

Run cAdvisor as a pod on each node
• gather stats from all containers
• export via REST

Run Heapster as a pod in the cluster
• just another pod, no special access
• aggregate stats

Run Influx and Grafana in the cluster
• more pods
• alternately: store in Google Cloud Monitoring

Logging

Optional add-on to Kubernetes clusters

Run fluentd as a pod on each node
• gather logs from all containers
• export to Elasticsearch

Run Elasticsearch as a pod in the cluster
• just another pod, no special access
• aggregate logs

Run Kibana in the cluster
• yet another pod
• alternately: store in Google Cloud Logging
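
A hedged sketch of the per-node fluentd pod; the image tag matches the docker ps output earlier, but the manifest shape and the /var/log mount are assumptions:

apiVersion: v1
kind: Pod
metadata:
  name: fluentd-cloud-logging
spec:
  containers:
  - name: fluentd-cloud-logging
    image: gcr.io/google_containers/fluentd-gcp:1.6  # as seen in docker ps above
    volumeMounts:
    - name: varlog
      mountPath: /var/log      # assumption: container logs are read from the node
  volumes:
  - name: varlog
    hostPath:
      path: /var/log           # the node’s log directory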

Example: Rolling Upgrade with Labels

Servers: four backend pods, each carrying a backend label plus a version label, initially all v1.2.

Two Replication Controllers drive the upgrade: one with selector {backend, v1.2} starting at replicas: 4, and one with selector {backend, v1.3} created at replicas: 1. The old controller is stepped down (replicas: 4 → 3 → 2 → 1 → 0) while the new one is stepped up (replicas: 1 → 2 → 3 → 4), replacing one pod at a time until all four backends run v1.3.
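
The two controllers from the walkthrough, sketched as manifests with illustrative names, images, and pod details; the upgrade is performed by resizing them in opposite directions, one replica at a time:

apiVersion: v1
kind: ReplicationController
metadata:
  name: backend-v1.2           # illustrative name
spec:
  replicas: 4                  # stepped 4 -> 3 -> 2 -> 1 -> 0
  selector:
    role: backend
    version: v1.2              # matches only the old pods
  template:
    metadata:
      labels:
        role: backend
        version: v1.2
    spec:
      containers:
      - name: backend
        image: example/backend:v1.2   # illustrative image
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: backend-v1.3           # illustrative name
spec:
  replicas: 1                  # stepped 1 -> 2 -> 3 -> 4
  selector:
    role: backend
    version: v1.3              # matches only the new pods
  template:
    metadata:
      labels:
        role: backend
        version: v1.3
    spec:
      containers:
      - name: backend
        image: example/backend:v1.3   # illustrative image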

ISA

ISA?

Open source: contribute!

Pets vs. Cattle

Questions?

Images by Connie Zhou

http://kubernetes.io

