Date post: | 16-Jan-2017 |
Category: |
Technology |
Upload: | thomas-jackson |
Salt Transport Modularity and Concurrency for Performance and Scale
Thomas Jackson
Staff Site Reliability Engineer
LinkedIn
3
Agenda
• for item in ('transport', 'concurrency'):
  • History
  • Problems
  • Options
  • Solution
4
Salt Transport: a history
Transport in Salt
• In the beginning Salt was primarily a remote execution engine
  • Send jobs from Master to N minions (defined by some target)
• In the beginning there was
5
"ZeroMQ (also spelled ØMQ, 0MQ or ZMQ) is a high-performance asynchronous messaging library, aimed at use in
distributed or concurrent applications."
- Wikipedia (https://en.wikipedia.org/wiki/ZeroMQ)
6
We took a normal TCP socket, injected it with a mix of radioactive isotopes stolen from a secret Soviet atomic
research project, bombarded it with 1950-era cosmic rays, and put it into the hands of a drug-addled comic book
author with a badly-disguised fetish for bulging muscles clad in spandex. Yes, ZeroMQ sockets are the world-saving
superheroes of the networking world.
- http://zguide.zeromq.org/page:all#How-It-Began
7
How ZMQ PUB/SUB looks
Server
import zmq
context = zmq.Context()
socket = context.socket(zmq.PUB)
socket.bind("tcp://*:12345")
socket.send(b"Message")

Client
import zmq
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://localhost:12345")
# A SUB socket receives nothing until it subscribes; b"" matches every message
socket.setsockopt(zmq.SUBSCRIBE, b"")
print(socket.recv())
8
How ZMQ REQ/REP looks
Server
import zmq
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:12345")
message = socket.recv()  # REP must receive before it may send
socket.send(b"got message")

Client
import zmq
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://localhost:12345")
socket.send(b"Hello")  # REQ must send before it may receive
message = socket.recv()
9
Request lifecycle
Master / Minion
1. Job publish
2. Sign-in (optional, potentially reused or cached)
3. Pillar fetch
4. SLS/file fetch (optional)
5. Return
10
Initial ZeroMQ implementation
• Master-initiated messages
  • Using the pub/sub socket pair in zmq
  • All broadcast messages from the master to the minion
• Minion-initiated messages
  • Using the req/rep socket pair in zmq
  • All messages initiated by the minion, such as:
    • Sign-in
    • Job return
    • Module sync
    • Pillar
    • Etc.
11
Initial problems
• Message loss
• Broadcasts were filtered client side
  • Added zmq filtering: https://github.com/saltstack/salt/pull/13285
• Etc.
12
13
Larger problems
• Huge ZMQ publisher memory leak (https://github.com/zeromq/libzmq/issues/954)
  • Workaround: Process manager in salt
• No concept of client state
  • When messages arrive, there is no way to see if the client is still connected, which leads to auth storms
  • Workaround: Exponential backoff on the minion side
• No sync "connect" (https://github.com/saltstack/salt/pull/21570)
  • Workaround: fire event and wait for it to return (or timeout to expire)
• Some users have issues with the LGPL license
  • Workaround: n/a
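The minion-side exponential backoff workaround can be sketched roughly as follows. This is only an illustration, not Salt's reconnect code: the helper names `backoff_delays` and `upper_bounds`, the `base` and `cap` parameters, and the choice of full jitter are all assumptions made for the example.

```python
import random


def upper_bounds(base=1.0, cap=60.0, attempts=6):
    """Return the un-jittered upper bound (seconds) for each retry attempt.

    Hypothetical helper: the bound doubles on every attempt
    (base, 2*base, 4*base, ...) and is clamped to `cap`.
    """
    return [min(cap, base * (2 ** attempt)) for attempt in range(attempts)]


def backoff_delays(base=1.0, cap=60.0, attempts=6, rng=None):
    """Yield one randomized delay per retry attempt.

    Each delay is drawn uniformly between 0 and that attempt's upper
    bound ("full jitter"), so that many minions which lost the master
    at the same moment do not all retry in lockstep.
    """
    rng = rng or random.Random()
    for upper in upper_bounds(base, cap, attempts):
        yield rng.uniform(0, upper)
```

A caller would sleep for each yielded delay between sign-in attempts; spreading the retries out is what takes the edge off an auth storm.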
15
The Reliable Asynchronous Event Transport, or RAET, is an alternative transport medium
developed specifically with Salt in mind. It has been developed to allow queuing to happen up on the application layer and comes with socket layer encryption. It also abstracts a great deal of control over the socket layer and makes it
easy to bubble up errors and exceptions.
- docs.saltstack.com
Salt Transport: previous attempt
16
RAET
• The good
  • No ZMQ!
• The bad
  • Effectively a re-implementation of the daemons (separate files, etc.)
  • Unable to run zmq and RAET simultaneously (initially; hydra was added later, which just runs both daemons at once)
• The different
  • Changed the model from “minions always connect” to “minions are listening”, meaning minions have a socket to attack
17
18
Salt Transport: back to basics
What do we really need
• Salt is a platform, not a specific transport: we need transports to be modular
• Some requirements:
  • Simple interface to implement (such that other modules can be written)
  • Test coverage (including pre-canned tests for new modules)
  • Support N transports simultaneously (for ramps, and complex infra)
  • Clear contract of security/privacy requirements of various methods
19
Salt Transport: Channels!
• ReqChannel: minion to master messages
  • Master
    • pre_fork(self, process_manager)
    • post_fork(self, payload_handler, io_loop)
  • Minion
    • send(self, load, tries=3, timeout=60)
    • crypted_transfer_decode_dictentry(self, load, dictkey=None, tries=3, timeout=60)
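To make the shape of this interface concrete, here is a toy in-process transport that implements the master-side and minion-side methods listed above. It is a sketch under assumptions: the class names are illustrative, there are no sockets, forking, or encryption, and `crypted_transfer_decode_dictentry` is left out.

```python
from abc import ABC, abstractmethod


class ReqServerChannel(ABC):
    """Master side of a request channel (illustrative base class)."""

    @abstractmethod
    def pre_fork(self, process_manager):
        """Set up anything that must exist before worker processes fork."""

    @abstractmethod
    def post_fork(self, payload_handler, io_loop):
        """Start serving in a worker; `payload_handler` answers each request."""


class ReqChannel(ABC):
    """Minion side of a request channel (illustrative base class)."""

    @abstractmethod
    def send(self, load, tries=3, timeout=60):
        """Send `load` to the master and return its reply."""


class InMemoryReqServerChannel(ReqServerChannel):
    """Toy transport: keeps the handler in memory instead of on a socket."""

    def __init__(self):
        self.payload_handler = None

    def pre_fork(self, process_manager):
        pass  # a real transport might bind its listening socket here

    def post_fork(self, payload_handler, io_loop):
        self.payload_handler = payload_handler


class InMemoryReqChannel(ReqChannel):
    def __init__(self, server):
        self.server = server

    def send(self, load, tries=3, timeout=60):
        # `tries` and `timeout` are ignored: an in-process call cannot time out.
        return self.server.payload_handler(load)


server = InMemoryReqServerChannel()
server.pre_fork(process_manager=None)
server.post_fork(payload_handler=lambda load: {"echo": load}, io_loop=None)
reply = InMemoryReqChannel(server).send({"cmd": "ping"})
```

The point of the small surface area is that a new transport only has to fill in these few methods.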
20
• PubChannel: broadcasts to the appropriate minions
  • Master
    • pre_fork(self, process_manager)
    • publish(self, load)
  • Minion
    • on_recv(self, callback)
21
Responsibilities
• Serialization
• Encryption
• Targeting (pub channel only)
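As a rough illustration of where serialization and targeting sit inside a pub channel, consider the in-process stand-in below. Everything in it is an assumption made for the example: `ToyPubChannel` is a made-up name, JSON stands in for the real serializer, targeting is a simple glob match on a `tgt` key, `on_recv` takes an extra `minion_id` argument, and encryption is omitted entirely.

```python
import fnmatch
import json


class ToyPubChannel:
    """In-process stand-in for a pub channel (hypothetical, for illustration)."""

    def __init__(self):
        self._subscribers = {}  # minion id -> callback

    def on_recv(self, minion_id, callback):
        """Register the callback a minion wants invoked for each message."""
        self._subscribers[minion_id] = callback

    def publish(self, load):
        """Serialize `load` once and deliver it to every targeted minion."""
        payload = json.dumps(load)  # serialization
        for minion_id, callback in self._subscribers.items():
            if fnmatch.fnmatch(minion_id, load["tgt"]):  # targeting
                callback(json.loads(payload))


received = []
channel = ToyPubChannel()
channel.on_recv("web1", lambda msg: received.append(("web1", msg["fun"])))
channel.on_recv("db1", lambda msg: received.append(("db1", msg["fun"])))
channel.publish({"tgt": "web*", "fun": "test.ping"})
```

Here only `web1` sees the job, because the channel, not the caller, decides who receives a broadcast.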
22
TCP channel
• Wire protocol: msgpack({'head': SOMEHEADER, 'body': SOMEBODY})
• Main advantages over ZMQ? Better failure modes
  • Faster failure detection (if minion isn’t connected to the master, you don’t have to wait for the timeouts)
  • True link-status (no more auth storms!)
  • Basically, we have sockets again!
• https://docs.saltstack.com/en/develop/topics/transports/tcp.html
23
TCP: How does it look?
async_channel = salt.transport.client.AsyncReqChannel.factory(minion_opts)
ret = yield async_channel.send(msg)
24
TCP: How accurate?
• ZeroMQ
  • Total jobs: 1000
  • Completed jobs: 171
  • Hit rate: 17.1%
• TCP
  • Total jobs: 1000
  • Completed jobs: 1000
  • Hit rate: 100%
25
TCP: How does it perform
• 15 byte message
  • ZeroMQ*
    • Average time: 0.00295809405715
    • QPS: 2246.952241147
  • TCP
    • Average time: 0.0023341544863
    • QPS: 2580.04452801
26
TCP: How does it perform
• 1053 byte message
  • ZeroMQ*
    • Average time: 0.00278297542184
    • QPS: 2489.300394919
  • TCP
    • Average time: 0.00251070397869
    • QPS: 2602.4855051
27
Awesome!
• Definitely awesome!
• But async? What was that about?
• Before we get into specifics, let's talk about concurrency
28
Concurrency
The General Problem
We have lots of things to do, some of which are blocking calls to remote things which are “slow”. It is more efficient (and overall “faster”) to work on something else while we wait for that “slow” call.
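A small timing experiment makes the point. The sketch below uses the standard library's asyncio purely as a convenient stand-in; `slow_call` is a made-up helper that sleeps instead of talking to a remote service.

```python
import asyncio
import time


async def slow_call(name, delay):
    """Stand-in for a slow remote call; sleeps instead of doing I/O."""
    await asyncio.sleep(delay)
    return name


async def main():
    # All three waits overlap, so the total is close to the longest delay
    # rather than the sum of the delays.
    return await asyncio.gather(
        slow_call("a", 0.2),
        slow_call("b", 0.2),
        slow_call("c", 0.2),
    )


start = time.monotonic()
results = asyncio.run(main())
elapsed = time.monotonic() - start
```

Run one after another, the three calls would take about 0.6 seconds; overlapped, they finish in roughly the time of one.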
29
30
Current state of concurrency in Salt
• Master-side:
  • The master creates N MWorkers to process N requests in parallel
  • Also interfaces with non-blocking code, using `while True:` loops to do timeouts etc.
• Minion-side:
  • Threads used in MultiMaster for managing the multiple master connections
31
Problems
• No unified approach (multiprocessing, threading, nonblocking “loops” -- all in use)
• Slow and/or blocking operations hold process/thread while waiting
• No consistent use of non-blocking libraries, so the code is a mix of loops and blocking calls
• Limited scalability (each approach scales differently)
32
Common solutions in Python
• Threading
• Multiprocessing
• User-space “threads”: Coroutines / stackless threads
33
Threading
• Some isolation between threads
• Pre-emptive scheduling

import threading
import requests

NUM_REQUESTS = 10  # example value; not specified on the slide

def handle_request():
    ret = requests.get('http://slowthing/')
    # do something else

threads = []
for x in range(NUM_REQUESTS):
    t = threading.Thread(target=handle_request)
    t.start()
    threads.append(t)
# wait for every request thread to finish
for t in threads:
    t.join()
34
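Multiprocessing
• Complete isolation
• Pre-emptive scheduling

import multiprocessing
import requests

NUM_REQUESTS = 10  # example value; not specified on the slide

def handle():
    ret = requests.get('http://slowthing/')
    # do something else

if __name__ == '__main__':  # needed where child processes are spawned rather than forked
    processes = []
    for x in range(NUM_REQUESTS):
        p = multiprocessing.Process(target=handle)
        p.start()
        processes.append(p)
    # wait for every worker process to exit
    for p in processes:
        p.join()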
35
User-space “threads”: Coroutines / stackless threads
• Some libraries you may have heard of
  • gevent
  • Stackless python
  • Greenlet
  • Twisted
  • Tornado
• How are these implemented
  • Green threads
  • callbacks
  • coroutines
36
Why Coroutines?
• Coroutines have been in use in Python for a while (tornado)
• The new asyncio in Python 3 (tulip) is coroutines (https://docs.python.org/3/library/asyncio.html)
37
Coroutines are computer program components that generalize subroutines for
nonpreemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations.
- https://en.wikipedia.org/wiki/Coroutine
38
Coroutines: what is this magic?
def item_of_work():
    while True:
        input = yield  # pause until the caller sends in a value
        yield do_something(input)  # hand the result back and pause again
Coroutines: what is this magic?
def some_complex_handle():
    while True:
        input = yield
        out1 = do_something(input)
        yield None  # give other work a chance to run between steps
        out2 = do_something2(out1)
        yield None
        # returning a value from a generator requires Python 3.3+
        return do_something3(out2)
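For readers who have not driven a generator-based coroutine by hand, the following self-contained sketch shows the mechanics of priming, sending, and closing. `running_total` is a made-up example, not something from Salt.

```python
def running_total():
    """Generator-based coroutine: receives numbers, yields the running sum."""
    total = 0
    while True:
        value = yield total  # pause here until the caller sends a value
        total += value


coro = running_total()
first = next(coro)      # prime the coroutine: run up to the first yield
second = coro.send(5)   # resume with 5; pauses again at the next yield
third = coro.send(10)   # resume with 10
coro.close()            # raises GeneratorExit inside the generator
```

Each `send()` resumes the function exactly where it last paused, which is the "multiple entry points" idea from the definition quoted earlier.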
40
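Tornado coroutines
• Some isolation between coroutines
• Explicit yield
• Light “threads”

import tornado.gen
import tornado.httpclient
import tornado.ioloop

@tornado.gen.coroutine
def handle_request():
    # requests.get() blocks and does not return a future,
    # so use tornado's asynchronous HTTP client instead
    client = tornado.httpclient.AsyncHTTPClient()
    ret = yield client.fetch('http://slow/')
    # do something else

loop = tornado.ioloop.IOLoop.current()
loop.spawn_callback(handle_request)
loop.start()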
41
Coroutines: futures
• Futures are just objects that represent a thing that will complete in the future
• This allows methods to return immediately, but finish the task in the future
• This allows the callers to yield execution until the futures they depend on complete
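The three points above can be seen in a few lines. This sketch uses asyncio futures from the standard library rather than tornado's, to stay dependency-free; `resolve_later` is a hypothetical helper that completes the future after a short delay.

```python
import asyncio


async def resolve_later(future, value, delay):
    """Complete `future` after `delay` seconds (hypothetical helper)."""
    await asyncio.sleep(delay)
    future.set_result(value)


async def main():
    loop = asyncio.get_running_loop()
    future = loop.create_future()   # exists immediately, but is not done yet
    was_done = future.done()
    task = asyncio.ensure_future(resolve_later(future, 42, 0.05))
    result = await future           # suspend until the future completes
    await task
    return was_done, result


was_done_before, result = asyncio.run(main())
```

The caller holds a future right away, keeps control of when it waits, and only suspends at the `await`.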
42
Coroutines: with futures
• Yield execution, and get returns
• Method looks fairly normal
• Stack traces in here have context
• Easy chaining of futures

@tornado.gen.coroutine
def some_complex_handle(request):
    a = yield is_authd(request)
    if not a:
        # on Python 2, use `raise tornado.gen.Return(False)` instead
        return False
    ret = yield do_request(request)
    # yield a list so both saves run in parallel and are both awaited
    yield [save1(ret), save2(ret)]
    return ret
43
Tornado in Salt
• What is tornado?
  • Python web framework and asynchronous networking library
• Why Tornado and not asyncio?
  • Free Python 2.x compatibility!
  • A fairly comprehensive set of libraries for it (http, locks, queues, etc.)
44
Back to the transport interfaces
• AsyncReqChannel
  • send: return a future
  • crypted_transfer_decode_dictentry: return a future

ret = yield channel.send(load, timeout=timeout)
45
Now what?
• Now that we have a real concurrency model, what have we done with it?
  • MultiMinion in a single process (coroutine per connection)
  • Easily implement concurrent networking within Salt
    • TCP transport
    • IPC
46
47
Concurrency problems
Really? Problems?
• Most common pitfalls to concurrent programming
  • race conditions and memory collisions
  • deadlocks
48
Race conditions
• Weird data problems in the reactor: https://github.com/saltstack/salt/issues/23373
• The underlying problem: the objects injected into modules (__salt__ etc.) were just dicts, which aren’t thread-safe (or coroutine-safe!)
• The solution? `ContextDict`
49
ContextDict
Copy-on-write thread/coroutine specific dict
• Works just like a dict
• Exposes a clone() method, which creates a `ChildContextDict`, a thread/coroutine-local copy
• With tornado’s StackContext, we switch the backing dict of the parent with your child using a context manager

cd = ContextDict(foo='bar')
print(cd['foo'])  # will be bar
with tornado.stack_context.StackContext(cd.clone):
    print(cd['foo'])  # will be bar
    cd['foo'] = 'baz'
    print(cd['foo'])  # will be baz
print(cd['foo'])  # will be bar
More examples: https://github.com/saltstack/salt/blob/develop/tests/unit/context_test.py
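The behavior described above can be approximated in a few lines with the standard library. To be clear about the assumptions: `ToyContextDict` is a hypothetical class, not Salt's `ContextDict`; it uses `contextvars` instead of tornado's `StackContext`; and it copies the data eagerly when `clone()` is entered rather than on first write.

```python
import contextvars
from contextlib import contextmanager


class ToyContextDict:
    """Dict-like object whose contents can be overridden per context."""

    def __init__(self, **data):
        self._global = dict(data)
        # Holds the private copy for the current context, if any.
        self._local = contextvars.ContextVar("toy_context_dict", default=None)

    def _active(self):
        local = self._local.get()
        return self._global if local is None else local

    def __getitem__(self, key):
        return self._active()[key]

    def __setitem__(self, key, value):
        self._active()[key] = value

    @contextmanager
    def clone(self):
        """Swap in a private copy of the data for the duration of the block."""
        token = self._local.set(dict(self._active()))
        try:
            yield self
        finally:
            self._local.reset(token)  # restore whatever was active before


cd = ToyContextDict(foo="bar")
with cd.clone():
    inside_before = cd["foo"]
    cd["foo"] = "baz"
    inside_after = cd["foo"]
outside = cd["foo"]
```

Writes made inside the block land in the private copy, so code outside the block keeps seeing the original value.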
50
Deadlocks
• Haven't seen any yet *knock on wood*. In general we avoid these since each coroutine is more-or-less independent of the others
51
Layers!
• Don’t forget, concurrency at all layers, including your DC-wide state execution
• For example: automated highstate enforcement of your whole DC
  • Does it matter if all DB hosts update at once?
  • Does it matter if all web servers update at once?
  • Does it matter if all edge boxes update at once?
52
zk_concurrency
Concurrency controls for state execution

acquire_lock:
  zk_concurrency.lock:
    - name: /trafficserver
    - zk_hosts: 'zookeeper:2181'
    - max_concurrency: 4
    - prereq:
      - service: trafficserver

trafficserver:
  service.running: []

release_lock:
  zk_concurrency.unlock:
    - name: /trafficserver
    - require:
      - service: trafficserver
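As the state reads, at most `max_concurrency` hosts hold the lock at any one time. The same idea can be demonstrated inside one process with a bounded semaphore. This is only a local analogy under assumptions: threads stand in for hosts, `restart_service` is a made-up function, and nothing here talks to ZooKeeper.

```python
import threading
import time

MAX_CONCURRENCY = 4  # mirrors max_concurrency in the state above
slots = threading.BoundedSemaphore(MAX_CONCURRENCY)
counter_lock = threading.Lock()
state = {"running": 0, "peak": 0}


def restart_service(host):
    """Pretend to restart a service on `host` while holding one slot."""
    with slots:  # plays the role of acquire_lock ... release_lock
        with counter_lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.05)  # stand-in for the real work
        with counter_lock:
            state["running"] -= 1


threads = [
    threading.Thread(target=restart_service, args=("host%d" % i,))
    for i in range(12)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Twelve "hosts" ask to restart, but the recorded peak never exceeds four, which is the guarantee you want before letting a highstate loose on a whole tier.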
53
Future Awesomeness
Things on my “list”
• Transport
  • failover groups
  • even better HA (https://github.com/saltstack/salt/issues/25700 -- get involved in the conversation)
• Concurrency
  • async ext_pillar
  • Partially concurrent state execution (prefetch, etc.)?
  • Coroutine-based:
    • Reactor
    • Engines
    • Beacons
    • Thorium
©2014 LinkedIn Corporation. All Rights Reserved.