Post on 30-Jun-2015
CoreOS: What is it and why should I care?
1 / 80
Who am I?
Karl Grzeszczak
Senior Software Engineer - Mediafly
twitter @karl_grz
karlgrz.com
2 / 80
At Mediafly, a lot of our infrastructure is service-oriented distributed systems running docker containers
3 / 80
CoreOS seems like an ideal fit for our needs, so I decided to investigate
4 / 80
I am -not- affiliated with CoreOS
(I'm just curious and wanted to understand it!)
5 / 80
Brief Overview
6 / 80
Lightweight
CoreOS is designed to be a modern, minimal base to build your platform. Consumes 40% less RAM on boot than an average Linux installation.
https://coreos.com/
7 / 80
Painless Updating
Utilizes an active/passive dual-partition scheme to update the OS as a single unit instead of package by package. This makes each update quick, reliable and able to be easily rolled back.
https://coreos.com/
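A toy sketch of the active/passive idea, to make the rollback claim concrete. This is not CoreOS's actual updater: the class and method names are hypothetical, and it only models "write the new image to the partition not in use, then swap".

```python
class DualPartitionHost:
    """Toy model of an active/passive root partition pair (hypothetical names)."""

    def __init__(self, installed="v1"):
        self.images = {"A": installed, "B": None}  # partition B starts empty
        self.active = "A"

    @property
    def passive(self):
        return "B" if self.active == "A" else "A"

    @property
    def running(self):
        return self.images[self.active]

    def stage_update(self, version):
        # The whole OS image is written as one unit to the partition not in use.
        self.images[self.passive] = version

    def _switch(self):
        if self.images[self.passive] is None:
            raise RuntimeError("passive partition holds no image")
        self.active = self.passive

    def reboot_into_update(self):
        # Booting the freshly written partition makes it the active one.
        self._switch()

    def rollback(self):
        # The previous image is still intact on the other partition.
        self._switch()
```

Staging an update never touches the running image, which is why a rollback is just another swap.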
8 / 80
Docker Containers
Applications on CoreOS run as Docker containers. Containers provide maximum flexibility in packaging and can start in milliseconds.
https://coreos.com/
9 / 80
Clustered By Default
CoreOS works well on a single machine, but it's designed to be clustered. Easily run application containers across multiple machines with fleet and connect them together with service discovery.
https://coreos.com/
10 / 80
Distributed Systems Tools
Built-in primitives such as distributed locking and master election are the building blocks for large scale distributed systems.
https://coreos.com/
11 / 80
Service Discovery
Easily locate where services are being run within the cluster and be notified when something changes. Essential for a complex, highly dynamic cluster. Built into CoreOS with high availability and automatic fail-over.
https://coreos.com/
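"Be notified when something changes" maps onto a long-poll watch in the etcd v2 keys API. A minimal sketch that only builds the watch URL and does not contact a cluster. The function name and the `/apps` prefix are illustrative assumptions, and the base address assumes a local etcd on client port 4001.

```python
import urllib.parse


def watch_url(key, base="http://127.0.0.1:4001", recursive=True, wait_index=None):
    """Build the URL for a long-poll watch on `key` (etcd v2 keys API)."""
    params = {"wait": "true"}
    if recursive:
        # Also report changes to keys underneath `key`.
        params["recursive"] = "true"
    if wait_index is not None:
        # Resume from a known index so no change is missed between polls.
        params["waitIndex"] = str(wait_index)
    return f"{base}/v2/keys/{key.strip('/')}?{urllib.parse.urlencode(params)}"
```

A GET on that URL blocks until something under the key changes, so a proxy or load balancer could re-read its upstream list whenever the call returns.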
12 / 80
How is it different from other *NIXes?
13 / 80
No package manager
All your applications should run as a container
Linux kernel, docker, systemd, fleetd, etcd, sshd
According to https://coreos.com, it uses 114MB of RAM at boot, approximately 40% less than the average Linux server
Designed specifically for running distributed systems
14 / 80
This is ideal if you already use docker
15 / 80
What do you have to do differently?
16 / 80
What do you have to do differently?
etcd service discovery
broadcast your application's key infrastructure settings back to etcd
use fleet to orchestrate your containers
19 / 80
etcd
20 / 80
A highly-available key value store for shared configuration and service discovery. etcd is inspired by Apache ZooKeeper and doozer
https://github.com/coreos/etcd#readme-version-046
22 / 80
Simple: curl'able user facing API (HTTP+JSON)
Secure: optional SSL client cert authentication
Fast: benchmarked 1000s of writes/s per instance
Reliable: properly distributed using Raft
etcd is written in Go and uses the Raft consensus algorithm to manage a highly-available replicated log.
https://github.com/coreos/etcd#readme-version-046
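To show what "curl'able" means in practice, here is a minimal sketch that builds a set request for the etcd v2 keys API and reads the value out of a response body. It does not send anything. The helper names are hypothetical, and the base address assumes a local etcd on client port 4001.

```python
import json
import urllib.parse
import urllib.request

ETCD_BASE = "http://127.0.0.1:4001"  # assumption: local etcd, client port 4001


def build_set_request(key, value, ttl=None, base=ETCD_BASE):
    """Build (but do not send) a PUT request that sets `key` to `value`."""
    form = {"value": value}
    if ttl is not None:
        form["ttl"] = str(ttl)  # seconds until the key expires
    body = urllib.parse.urlencode(form).encode("utf-8")
    url = f"{base}/v2/keys/{key.lstrip('/')}"
    return urllib.request.Request(url, data=body, method="PUT")


def value_from_response(raw):
    """Extract the stored value from a v2 JSON response body."""
    return json.loads(raw)["node"]["value"]
```

Passing the request to `urllib.request.urlopen` would perform the write against a running cluster.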
23 / 80
Raft Consensus Algorithm
24 / 80
In Search of an Understandable Consensus Algorithm by Stanford's Diego Ongaro and John Ousterhout
https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf
"As a result, each state machine processes the same series of commands and thus produces the same series of results and arrives at the same series of states."
http://raftconsensus.github.io/
25 / 80
Basically...
26 / 80
Raft elects a leader, and the leader records a master version of the log and replicates it to the other nodes in the cluster. It does not confirm a write until a majority of the nodes have acknowledged it.
If the leader goes AWOL for a certain time, a new election begins to find a new leader so the cluster can continue.
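The majority rule is easy to put into numbers. A minimal sketch of the arithmetic only, not a Raft implementation; the function names are my own.

```python
def majority(cluster_size):
    """Smallest number of nodes that forms a majority of the cluster."""
    return cluster_size // 2 + 1


def is_committed(acks, cluster_size):
    """True once enough nodes (leader included) have stored the entry."""
    return acks >= majority(cluster_size)


def tolerated_failures(cluster_size):
    """How many nodes can be lost while a majority is still reachable."""
    return cluster_size - majority(cluster_size)
```

A 3-node cluster like the one in this walkthrough needs 2 acknowledgements per write and survives the loss of 1 node; going to 4 nodes does not raise that tolerance, which is why odd cluster sizes are the usual choice.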
27 / 80
For now, just understand...
Raft is similar to Paxos in fault tolerance and performance, and it makes sure that etcd and your cluster can continue operating even if some nodes experience partitions (or are terminated!), as long as a majority of nodes can still reach each other
28 / 80
This is an AWESOME animation you should watch because it explains Raft MUCH better than I can:
http://thesecretlivesofdata.com/raft/
29 / 80
fleet
30 / 80
fleet ties systemd and etcd together into a distributed init system
32 / 80
Supported Deployment Patterns
33 / 80
Deploy a single unit anywhere on the cluster
https://github.com/coreos/fleet#supported-deployment-patterns
34 / 80
Deploy a unit globally everywhere in the cluster
https://github.com/coreos/fleet#supported-deployment-patterns
35 / 80
Automatic rescheduling of units on machine failure
https://github.com/coreos/fleet#supported-deployment-patterns
36 / 80
Ensure that units are deployed together on the same machine
https://github.com/coreos/fleet#supported-deployment-patterns
37 / 80
Forbid specific units from colocation on the same machine (anti-affinity)
https://github.com/coreos/fleet#supported-deployment-patterns
38 / 80
Deploy units to machines only with specific metadata
https://github.com/coreos/fleet#supported-deployment-patterns
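Most of these patterns are expressed as options in a unit's [X-Fleet] section. A small sketch that renders such a section as text. Only MachineOf and MachineID appear elsewhere in this deck; Conflicts, MachineMetadata and Global are option names as I recall them from the fleet documentation, so treat them (and the example values) as assumptions to check against the fleet version in use.

```python
def x_fleet_section(**options):
    """Render an [X-Fleet] section from option=value pairs."""
    lines = ["[X-Fleet]"]
    for name, value in options.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{name}={value}")
    return "\n".join(lines)


# One illustrative section per deployment pattern (values are made up).
PATTERNS = {
    "together on the same machine": x_fleet_section(MachineOf="karlgrz_web.service"),
    "anti-affinity": x_fleet_section(Conflicts="karlgrz_web.service"),
    "only with specific metadata": x_fleet_section(MachineMetadata="region=us-east"),
    "globally everywhere": x_fleet_section(Global=True),
}
```

Automatic rescheduling on machine failure needs no option here; fleet handles that for any unit that is not pinned to a single machine.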
39 / 80
It makes it very easy to know what is running in your cluster, where, and how it's doing
40 / 80
fleet has a LOT of promise, and is my favorite part of CoreOS
41 / 80
...but...
42 / 80
...it's also my least favorite part of CoreOS
43 / 80
fleet (0.8) seems very early, rough, and opinionated whereas etcd seems ready for production
44 / 80
...but it feels like the best option out there right now
45 / 80
Read this post later:
http://lukebond.ghost.io/deploying-docker-containers-on-a-coreos-cluster-with-fleet/
I found this while putting together this presentation, and I think it does a great job explaining all this in written form
46 / 80
Show me teh codez
47 / 80
Walkthrough on Vagrant
http://github.com/coreos/coreos-vagrant
https://coreos.com/docs/running-coreos/platforms/vagrant/
48 / 80
bootstrapping the cluster
karl@karl-mediafly:~$ curl discovery.etcd.io/new
https://discovery.etcd.io/b9845b31a57793fe9f88137220b7f454
49 / 80
Output gets pasted into user-data:
#cloud-config

coreos:
  etcd:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    # WARNING: replace each time you 'vagrant destroy'
    discovery: https://discovery.etcd.io/b9845b31a57793fe9f88137220b7f454
    addr: $public_ipv4:4001
    peer-addr: $public_ipv4:7001
  fleet:
    public-ip: $public_ipv4
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
    - name: docker-tcp.socket
      command: start
      enable: true
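Because the discovery token has to be replaced after every 'vagrant destroy', it is tempting to script the swap. A minimal sketch, assuming the user-data contains exactly one `discovery:` line as in the file above; the function name is hypothetical and fetching the new URL is left out.

```python
import re

# Matches the whole "discovery: <url>" line, keeping its indentation in group 1.
DISCOVERY_LINE = re.compile(r"^([ \t]*discovery:[ \t]*)\S+[ \t]*$", re.MULTILINE)


def with_discovery_url(user_data, url):
    """Return `user_data` with its discovery URL replaced by `url`."""
    new_text, count = DISCOVERY_LINE.subn(lambda m: m.group(1) + url, user_data)
    if count != 1:
        raise ValueError(f"expected exactly one discovery line, found {count}")
    return new_text
```

The comment lines that mention https://discovery.etcd.io/new are left alone because they do not start with `discovery:`.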
50 / 80
show all machines in your cluster
core@core-01 ~/share $ fleetctl list-machines
MACHINE      IP            METADATA
78e5ab3e...  172.17.8.103  -
adddf8be...  172.17.8.102  -
df763c2f...  172.17.8.101  -
51 / 80
service unit
[Unit]
Description=karlgrz.com
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill karlgrz_web
ExecStartPre=-/usr/bin/docker rm karlgrz_web
ExecStartPre=/usr/bin/docker pull karlgrz/ubuntu-14.04-base-nginx
ExecStartPre=/bin/sh -c "cd /srv/karlgrz.com && \
  /usr/bin/docker build -t karlgrz/karlgrz_web ."
ExecStart=/usr/bin/docker run --name karlgrz_web -p 8001:8001 karlgrz/karlgrz_web
ExecStop=/usr/bin/docker stop karlgrz_web
52 / 80
start up some units
core@core-01 ~/share/karlgrz-docker/fleet $ fleetctl start fantasy_web.service \
  jcsdoorsolutions_web.service stickfigureninjas_web.service karlgrz_web.service
Unit fantasy_web.service launched on adddf8be.../172.17.8.102
Unit karlgrz_web.service launched on adddf8be.../172.17.8.102
Unit jcsdoorsolutions_web.service launched on 78e5ab3e.../172.17.8.103
Unit stickfigureninjas_web.service launched on 78e5ab3e.../172.17.8.103
53 / 80
list loaded units and their status
core@core-01 ~/share/karlgrz-docker/fleet $ fleetctl list-units
UNIT                           MACHINE                   ACTIVE      SUB
fantasy_web.service            adddf8be.../172.17.8.102  activating  start-pre
jcsdoorsolutions_web.service   78e5ab3e.../172.17.8.103  activating  start-pre
karlgrz_web.service            adddf8be.../172.17.8.102  active      running
stickfigureninjas_web.service  78e5ab3e.../172.17.8.103  activating  start-pre

core@core-01 ~/share/karlgrz-docker/fleet $ fleetctl list-units
UNIT                           MACHINE                   ACTIVE  SUB
fantasy_web.service            adddf8be.../172.17.8.102  active  running
jcsdoorsolutions_web.service   78e5ab3e.../172.17.8.103  active  running
karlgrz_web.service            adddf8be.../172.17.8.102  active  running
stickfigureninjas_web.service  78e5ab3e.../172.17.8.103  active  running
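If you want to act on this output from a script, the whitespace-separated columns are simple to parse. A minimal sketch, assuming the four-column layout shown in this deck (UNIT, MACHINE, ACTIVE, SUB) with a single header row; the function names are my own and other fleetctl versions may print different columns.

```python
from collections import defaultdict


def parse_list_units(output):
    """Turn `fleetctl list-units` text into a list of dicts."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        unit, machine, active, sub = line.split()
        machine_id, _, ip = machine.partition("/")
        rows.append({"unit": unit, "machine": machine_id, "ip": ip,
                     "active": active, "sub": sub})
    return rows


def units_by_ip(rows):
    """Group unit names by the IP of the machine they run on."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["ip"]].append(row["unit"])
    return dict(grouped)
```

Filtering the rows on `sub != "running"` gives a quick list of units that are still starting or have failed.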
54 / 80
discovery sidekick
[Unit]
Description=Announce karlgrz.com
BindsTo=karlgrz_web.service

[Service]
EnvironmentFile=/etc/environment
ExecStart=/bin/sh -c "while true; \
  do etcdctl set /apps/karlgrz_web \
  '{ \"host\": \"karlgrz.com\", \"appkey\": \"karlgrz_web\", \"ip\" :\"${COREOS_PUBLIC_IPV4}\", \"port\" :\"8001\" }' \
  --ttl 60; sleep 45; done"
ExecStop=/usr/bin/etcdctl rm /apps/karlgrz_web

[X-Fleet]
MachineOf=karlgrz_web.service
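The sidekick works because the key is written with a 60 second TTL and refreshed every 45 seconds: while the loop runs the key never lapses, and once it stops the key disappears on its own. A small simulation of that timing with an in-memory stand-in for etcd; the class and function names are hypothetical and time is passed in explicitly instead of sleeping.

```python
class TtlStore:
    """In-memory stand-in for a key-value store with per-key TTLs (not etcd)."""

    def __init__(self):
        self._entries = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl, now):
        self._entries[key] = (value, now + ttl)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None or now >= entry[1]:
            return None  # never set, or the TTL has run out
        return entry[0]


def announce(store, key, value, start, beats, ttl=60, interval=45):
    """Refresh `key` `beats` times, `interval` seconds apart.

    Returns the time of the last refresh.
    """
    now = start
    for i in range(beats):
        now = start + i * interval
        store.set(key, value, ttl, now)
    return now
```

The 15 second gap between interval and TTL is the slack the loop has for a slow write before the service briefly vanishes from discovery.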
55 / 80
run discovery sidekicks
core@core-01 ~/share/karlgrz-docker/fleet $ etcdctl ls /apps

core@core-01 ~/share/karlgrz-docker/fleet $ fleetctl start fantasy_discovery.service \
  jcsdoorsolutions_discovery.service stickfigureninjas_discovery.service \
  karlgrz_discovery.service
Unit jcsdoorsolutions_discovery.service launched on 78e5ab3e.../172.17.8.103
Unit stickfigureninjas_discovery.service launched on 78e5ab3e.../172.17.8.103
Unit fantasy_discovery.service launched on adddf8be.../172.17.8.102
Unit karlgrz_discovery.service launched on adddf8be.../172.17.8.102

core@core-01 ~/share/karlgrz-docker/fleet $ etcdctl ls /apps
/apps/rethinkdb_services
/apps/fantasy_web
/apps/karlgrz_web
/apps/jcsdoorsolutions_web
/apps/stickfigureninjas_web
56 / 80
etcd values
core@core-01 ~/share/karlgrz-docker/fleet $ etcdctl get /apps/karlgrz_web
{ "host": "karlgrz.com", "appkey": "karlgrz_web", "ip" :"172.17.8.102", "port" :"8001" }
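A consumer such as a reverse proxy only needs the ip and port out of that value. A minimal sketch of reading it, assuming the JSON shape shown above; the function name is my own.

```python
import json


def upstream_address(raw_value):
    """Turn an announced service record into an "ip:port" string."""
    record = json.loads(raw_value)
    # The sidekick stores the port as a string; int() rejects anything non-numeric.
    return f"{record['ip']}:{int(record['port'])}"
```

Fed the value above, this yields the address a proxy would forward karlgrz.com traffic to.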
57 / 80
list units
core@core-01 ~/share/karlgrz-docker/fleet $ fleetctl list-units
UNIT                                 MACHINE                   ACTIVE  SUB
fantasy_discovery.service            adddf8be.../172.17.8.102  active  running
fantasy_web.service                  adddf8be.../172.17.8.102  active  running
jcsdoorsolutions_discovery.service   78e5ab3e.../172.17.8.103  active  running
jcsdoorsolutions_web.service         78e5ab3e.../172.17.8.103  active  running
karlgrz_discovery.service            adddf8be.../172.17.8.102  active  running
karlgrz_web.service                  adddf8be.../172.17.8.102  active  running
rethinkdb_discovery.service          df763c2f.../172.17.8.101  active  running
rethinkdb_services.service           df763c2f.../172.17.8.101  active  running
stickfigureninjas_discovery.service  78e5ab3e.../172.17.8.103  active  running
stickfigureninjas_web.service        78e5ab3e.../172.17.8.103  active  running
58 / 80
run a unit on ONLY one SPECIFIC node
[Unit]
Description=rethinkdb
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill rethinkdb_services
ExecStartPre=-/usr/bin/docker rm rethinkdb_services
ExecStartPre=/usr/bin/docker pull dockerfile/rethinkdb
ExecStart=/usr/bin/docker run --name rethinkdb_services \
  -p 8080:8080 -p 28015:28015 -p 29015:29015 -v /home/core/rethinkdb:/data \
  -t dockerfile/rethinkdb rethinkdb -d /data --bind all
ExecStop=/usr/bin/docker stop rethinkdb_services

[X-Fleet]
MachineID=9f152bf8
59 / 80
see logging output from a running container
core@core-01 ~ $ fleetctl journal fantasy_web
-- Logs begin at Wed 2014-09-24 21:32:32 UTC, end at Thu 2014-09-25 19:55:26 UTC. --
Sep 25 18:22:08 core-02 docker[1572]: Python version: 2.7.6 (default, Mar 22 2014, 23:03:41) [GCC 4.8.2]
Sep 25 18:22:08 core-02 docker[1572]: Python main interpreter initialized at 0xc53540
Sep 25 18:22:08 core-02 docker[1572]: python threads support enabled
Sep 25 18:22:08 core-02 docker[1572]: your server socket listen backlog is limited to 100 connections
Sep 25 18:22:08 core-02 docker[1572]: your mercy for graceful operations on workers is 60 seconds
Sep 25 18:22:08 core-02 docker[1572]: mapped 72768 bytes (71 KB) for 1 cores
Sep 25 18:22:08 core-02 docker[1572]: *** Operational MODE: single process ***
Sep 25 18:22:09 core-02 docker[1572]: WSGI app 0 (mountpoint='') ready in 1 seconds on interpreter 0xc53540 pid: 13 (default app)
Sep 25 18:22:09 core-02 docker[1572]: *** uWSGI is running in multiple interpreter mode ***
Sep 25 18:22:09 core-02 docker[1572]: spawned uWSGI worker 1 (and the only) (pid: 13, cores: 1)
60 / 80
core@core-03 ~ $ fleetctl journal karlgrz_web
-- Logs begin at Wed 2014-09-24 21:32:32 UTC, end at Thu 2014-09-25 19:56:33 UTC. --
Sep 25 18:21:58 core-03 sh[1315]: ---> Using cache
Sep 25 18:21:58 core-03 sh[1315]: ---> ce8cd32fe157
Sep 25 18:21:58 core-03 sh[1315]: Step 6 : RUN cd /srv && make publish
Sep 25 18:21:58 core-03 sh[1315]: ---> Using cache
Sep 25 18:21:58 core-03 sh[1315]: ---> 83f7f333889b
Sep 25 18:21:58 core-03 sh[1315]: Step 7 : CMD ["nginx"]
Sep 25 18:21:58 core-03 sh[1315]: ---> Using cache
Sep 25 18:21:58 core-03 sh[1315]: ---> 4cf274f01dae
Sep 25 18:21:58 core-03 sh[1315]: Successfully built 4cf274f01dae
Sep 25 18:21:59 core-03 systemd[1]: Started karlgrz.com.
61 / 80
core@core-02 ~/share/karlgrz-docker/fleet $ fleetctl journal classholes_web
-- Logs begin at Wed 2014-09-24 21:32:01 UTC, end at Thu 2014-09-25 20:03:55 UTC. --
Sep 25 20:01:40 core-02 systemd[1]: Starting classholes.com...
Sep 25 20:01:40 core-02 docker[3071]: Error response from daemon: No such container: classholes_web
Sep 25 20:01:40 core-02 docker[3071]: 2014/09/25 20:01:40 Error: failed to kill one or more containers
Sep 25 20:01:40 core-02 docker[3085]: Error response from daemon: No such container: classholes_web
Sep 25 20:01:40 core-02 docker[3085]: 2014/09/25 20:01:40 Error: failed to remove one or more containers
Sep 25 20:01:40 core-02 docker[3095]: Pulling repository karlgrz/ubuntu-14.04-base-nginx
Sep 25 20:01:42 core-02 systemd[1]: classholes_web.service: control process exited, code=exited status=1
Sep 25 20:01:42 core-02 systemd[1]: Failed to start classholes.com.
Sep 25 20:01:42 core-02 sh[3110]: /bin/sh: line 0: cd: /home/core/share/classholes: No such file or directory
Sep 25 20:01:42 core-02 systemd[1]: Unit classholes_web.service entered failed state.
62 / 80
terminate a node and see the services running on it moved to another node in the cluster
karl@karl-mediafly:~/workspace/coreos-vagrant$ vagrant ssh core-03 -- -A
Last login: Thu Sep 25 16:37:01 2014 from 10.0.2.2
CoreOS (beta)
core@core-03 ~ $ shutdown -n
shutdown: invalid option -- 'n'
core@core-03 ~ $ shutdown
Must be root.
core@core-03 ~ $ sudo shutdown -n
shutdown: invalid option -- 'n'
core@core-03 ~ $ sudo shutdown
Shutdown scheduled for Thu 2014-09-25 16:46:14 UTC, use 'shutdown -c' to cancel.
Broadcast message from root@core-03 (Thu 2014-09-25 16:45:14 UTC):
The system is going down for power-off at Thu 2014-09-25 16:46:14 UTC!
63 / 80
core@core-02 ~ $ fleetctl list-units
UNIT                                 MACHINE                   ACTIVE  SUB
fantasy_discovery.service            adddf8be.../172.17.8.102  active  running
fantasy_web.service                  adddf8be.../172.17.8.102  active  running
karlgrz_discovery.service            adddf8be.../172.17.8.102  active  running
karlgrz_web.service                  adddf8be.../172.17.8.102  active  running
rethinkdb_discovery.service          df763c2f.../172.17.8.101  active  running
rethinkdb_services.service           df763c2f.../172.17.8.101  active  running
64 / 80
core@core-02 ~ $ fleetctl list-units
UNIT                                 MACHINE                   ACTIVE      SUB
fantasy_discovery.service            adddf8be.../172.17.8.102  active      running
fantasy_web.service                  adddf8be.../172.17.8.102  active      running
jcsdoorsolutions_discovery.service   df763c2f.../172.17.8.101  active      running
jcsdoorsolutions_web.service         df763c2f.../172.17.8.101  activating  start-pre
karlgrz_discovery.service            adddf8be.../172.17.8.102  active      running
karlgrz_web.service                  adddf8be.../172.17.8.102  active      running
rethinkdb_discovery.service          df763c2f.../172.17.8.101  active      running
rethinkdb_services.service           df763c2f.../172.17.8.101  active      running
stickfigureninjas_discovery.service  df763c2f.../172.17.8.101  active      running
stickfigureninjas_web.service        df763c2f.../172.17.8.101  activating  start-pre
65 / 80
core@core-02 ~ $ fleetctl list-machines
MACHINE      IP            METADATA
adddf8be...  172.17.8.102  -
df763c2f...  172.17.8.101  -
66 / 80
core@core-02 ~ $ fleetctl list-units
UNIT                                 MACHINE                   ACTIVE  SUB
fantasy_discovery.service            adddf8be.../172.17.8.102  active  running
fantasy_web.service                  adddf8be.../172.17.8.102  active  running
jcsdoorsolutions_discovery.service   df763c2f.../172.17.8.101  active  running
jcsdoorsolutions_web.service         df763c2f.../172.17.8.101  active  running
karlgrz_discovery.service            adddf8be.../172.17.8.102  active  running
karlgrz_web.service                  adddf8be.../172.17.8.102  active  running
rethinkdb_discovery.service          df763c2f.../172.17.8.101  active  running
rethinkdb_services.service           df763c2f.../172.17.8.101  active  running
stickfigureninjas_discovery.service  df763c2f.../172.17.8.101  active  running
stickfigureninjas_web.service        df763c2f.../172.17.8.101  active  running
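Comparing placements before and after the shutdown makes the rescheduling explicit. A minimal sketch that diffs two unit-to-machine mappings; the function name is my own, and the mappings would have to be filled in from `fleetctl list-units` output.

```python
def moved_units(before, after):
    """Return {unit: (old_machine, new_machine)} for units that changed machine."""
    return {
        unit: (before[unit], after[unit])
        for unit in before
        if unit in after and before[unit] != after[unit]
    }
```

With the placements from this walkthrough, the units that ran on 172.17.8.103 show up as moved to 172.17.8.101, while everything on 172.17.8.102 is untouched.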
67 / 80
Conclusions
68 / 80
Please keep in mind I ran this cluster on my laptop using Vagrant, not on cloud infrastructure
69 / 80
Clustering just worked
(I didn't even really have to think about failover or replication myself)
70 / 80
alpha software
fleet and etcd are great, but they both need some more work before being "production ready"
71 / 80
fleet in particular sometimes gets into situations where I have destroyed a unit but it still shows in the list of units for a while
72 / 80
fleet doesn't have a nice mechanism to restart all your units or groups (at least that I found)
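One possible workaround is to script it: stop a set of units, then start them again. A minimal sketch that only builds the two fleetctl command lines and does not run them. The helper name is hypothetical, and whether stop followed by start is an acceptable "restart" for your units is an assumption you would need to verify.

```python
def restart_commands(units):
    """Build argv lists that stop and then start the given fleet units."""
    units = list(units)
    if not units:
        return []
    return [
        ["fleetctl", "stop", *units],
        ["fleetctl", "start", *units],
    ]
```

Each argv list can be handed to `subprocess.run(cmd, check=True)` in order; grouping is then just a matter of which unit names you pass in.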
73 / 80
etcd is awesome :-)
74 / 80
not quite ready for Mediafly
75 / 80
...but...
76 / 80
I plan on deploying CoreOS to power my side projects, blog, and the handful of sites I run for friends soon
77 / 80
I feel that after a bit of work this will be the OS that powers distributed systems in the future
78 / 80
Questions?
79 / 80
Fin.
80 / 80