Pushing Python: Building a High Throughput, Low Latency System

Date post: 25-May-2015
Upload: kevin-ballard
Kevin Ballard SFpython.org 20140312
Transcript
Page 1: Pushing Python: Building a High Throughput, Low Latency System

Kevin Ballard, SFpython.org, 2014-03-12

Page 2: Pushing Python: Building a High Throughput, Low Latency System

kevin@tellapart.com

Introductions

Page 3: Pushing Python: Building a High Throughput, Low Latency System

Taba • Distributed event aggregation service

import taba
...
taba.RecordValue('winning_bid_price', wincpm)
...

$ taba-cli aggregate winning_bid_price
{"name": "winning_bid_price", "10m": {"count": 14709, "total": 5836.4}, "percentiles": [0.07 0.16 0.32 0.84 1.33 8.03]}
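To make the shape of that output concrete, here is a minimal in-process sketch of how recorded values might be rolled up into a count, a total, and a list of percentiles. It is not Taba's implementation: `record_value`, `aggregate`, and `nearest_rank` are hypothetical names, and the percentile cut points are an assumption, since the slide does not say which ones Taba reports.

```python
import math
from collections import defaultdict

# Hypothetical in-process stand-in; this is NOT Taba's implementation.
_events = defaultdict(list)


def record_value(name, value):
    """Buffer one observation under a metric name."""
    _events[name].append(float(value))


def nearest_rank(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def aggregate(name, percentiles=(25, 50, 75, 90, 95, 99)):
    """Summarize buffered observations as count, total, and percentiles.

    The cut points are assumed; the slide does not list them.
    """
    values = sorted(_events[name])
    return {
        "name": name,
        "count": len(values),
        "total": sum(values),
        "percentiles": [nearest_rank(values, p) for p in percentiles] if values else [],
    }


for cpm in (0.07, 0.16, 0.32, 0.84, 1.33, 8.03):
    record_value("winning_bid_price", cpm)
summary = aggregate("winning_bid_price")
```

A real service would aggregate across many clients and time windows; this sketch only shows the per-metric reduction.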

Page 4: Pushing Python: Building a High Throughput, Low Latency System

Taba

+10,000,000 events/sec

+50,000 metrics

+1,000 clients

+100 processors

Page 5: Pushing Python: Building a High Throughput, Low Latency System

GET THE DATA MODEL RIGHT

Lesson #1

Page 6: Pushing Python: Building a High Throughput, Low Latency System

Data Model

Page 7: Pushing Python: Building a High Throughput, Low Latency System

Data Model

Event: ('bid_cpm', 'Counter', time(), 0.233)

State:

Aggregate: {"10m": 43.9, "1h": 592.22}
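The Event, State, Aggregate split can be sketched as a small fold-and-project class. This is an illustration under assumptions, not Taba's code: the `CounterState` class, the `WINDOWS` table, and the choice to store raw samples are all hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque, namedtuple

# Same four fields as the Event tuple on the slide.
Event = namedtuple("Event", ["name", "type", "timestamp", "value"])

# Window label -> length in seconds (assumed from the "10m" and "1h" keys).
WINDOWS = {"10m": 600, "1h": 3600}


class CounterState:
    """Hypothetical State for a 'Counter': recent (timestamp, value) samples."""

    def __init__(self):
        self.samples = deque()

    def fold(self, event):
        """Fold one Event into the State."""
        self.samples.append((event.timestamp, event.value))

    def prune(self, now):
        """Drop samples that are older than the widest window."""
        horizon = now - max(WINDOWS.values())
        while self.samples and self.samples[0][0] <= horizon:
            self.samples.popleft()

    def aggregate(self, now):
        """Project the State into per-window totals."""
        self.prune(now)
        return {
            label: sum(v for t, v in self.samples if now - t < seconds)
            for label, seconds in WINDOWS.items()
        }


state = CounterState()
for ts, value in ((0, 1.0), (3000, 2.0), (3500, 0.5)):
    state.fold(Event("bid_cpm", "Counter", ts, value))
```

Keeping the State separate from the Aggregate means events can be folded in cheaply as they arrive, and the more expensive projection only runs when someone asks for it.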

Page 11: Pushing Python: Building a High Throughput, Low Latency System

STATE IS HARD

Lesson #2

Page 12: Pushing Python: Building a High Throughput, Low Latency System

Centralizing State

Page 13: Pushing Python: Building a High Throughput, Low Latency System

GENERATORS + GREENLETS = AWESOME

Lesson #3

Page 14: Pushing Python: Building a High Throughput, Low Latency System

Asynchronous Iterator

• JIT processing
• Automatically switches through I/O
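One reading of "JIT processing" is that each item is pulled through the pipeline only when the consumer asks for it. The sketch below shows that behavior with plain generators; the stage names are hypothetical. The greenlet half of the lesson is not shown: in a gevent-style setup, a blocking read inside the source stage would be the point where control switches to another greenlet.

```python
def read_events(lines, log):
    """Source stage: parse and yield one event at a time."""
    for line in lines:
        log.append(("read", line))
        name, value = line.split(",")
        yield name, float(value)


def only_metric(events, wanted):
    """Filter stage: pass through values for a single metric."""
    for name, value in events:
        if name == wanted:
            yield value


def running_total(values, log):
    """Sink stage: yield the running sum after each value."""
    total = 0.0
    for value in values:
        total += value
        log.append(("total", total))
        yield total


log = []
lines = ["bid_cpm,0.25", "other,1.0", "bid_cpm,0.5"]
pipeline = running_total(only_metric(read_events(lines, log), "bid_cpm"), log)

# Nothing has been read yet; asking for one result pulls exactly one
# matching line through all three stages.
first = next(pipeline)
log_after_first = list(log)
rest = list(pipeline)
```

Because no stage buffers its input, memory use stays proportional to one item rather than to the whole chunk.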

Page 15: Pushing Python: Building a High Throughput, Low Latency System

CPYTHON SUFFERS FROM MEMORY FRAGMENTATION

Lesson #4

Page 16: Pushing Python: Building a High Throughput, Low Latency System

Fragmentation

• Fragmentation is when a process’s heap is used inefficiently.
• The GC may report a low memory footprint, but the OS reports a much larger RSS.
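One way to look for this gap is to compare what Python thinks it has allocated with what the OS reports. The sketch below is a rough probe, not a benchmark: it uses `tracemalloc` for the Python-side figure and reads `/proc/self/statm` for RSS, which only works on Linux. The helper names are hypothetical, and how much RSS actually falls depends on the platform, the allocator, and the Python version.

```python
import gc
import os
import tracemalloc


def current_rss_bytes():
    """Resident set size read from /proc (Linux only); None elsewhere."""
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE")
    except (OSError, IndexError, ValueError, AttributeError):
        return None


def fragmentation_demo(n=200_000, keep_every=100):
    """Allocate many small objects, free most, and compare both views."""
    tracemalloc.start()
    blocks = [bytes(64) for _ in range(n)]
    tracked_full = tracemalloc.get_traced_memory()[0]
    rss_full = current_rss_bytes()

    # Keep a thin scattering of survivors so that few regions become
    # completely empty, then drop everything else.
    survivors = blocks[::keep_every]
    del blocks
    gc.collect()

    tracked_after = tracemalloc.get_traced_memory()[0]
    rss_after = current_rss_bytes()
    tracemalloc.stop()  # note: tracemalloc adds bookkeeping overhead of its own
    return {
        "tracked_full": tracked_full,
        "tracked_after": tracked_after,
        "rss_full": rss_full,
        "rss_after": rss_after,
        "survivors": len(survivors),
    }


report = fragmentation_demo(n=20_000)
```

If the tracked figure falls sharply while RSS barely moves, the difference is memory that the process still holds but cannot use efficiently.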

Page 22: Pushing Python: Building a High Throughput, Low Latency System

Hybrid Memory Management

• Use Cython to allocate page-sized blocks of pointers into the incoming chunk
• Hand off the whole thing to the CPython memory manager
• The whole thing gets deallocated at once
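The Cython code itself is not in the transcript. As a pure-Python analogy, and not the technique from the talk, the sketch below keeps an incoming chunk as a single buffer and reads fields out of it in place, so the chunk is released in one step instead of as thousands of small objects. The record layout (`RECORD`) and the function names are assumptions made for illustration.

```python
import struct

# Hypothetical wire format: little-endian float64 value + uint32 metric id.
RECORD = struct.Struct("<dI")


def pack_chunk(records):
    """Build one contiguous chunk (stands in for bytes read off the network)."""
    return b"".join(RECORD.pack(value, metric_id) for value, metric_id in records)


def total_for_metric(chunk, metric_id):
    """Sum values for one metric by reading fields in place from the chunk."""
    view = memoryview(chunk)
    total = 0.0
    # iter_unpack walks the buffer record by record; only short-lived
    # tuples are created, and no per-record object outlives the loop.
    for value, mid in RECORD.iter_unpack(view):
        if mid == metric_id:
            total += value
    return total


chunk = pack_chunk([(0.5, 1), (1.5, 2), (2.0, 1)])
bid_total = total_for_metric(chunk, 1)
```

When `chunk` goes out of scope, one large allocation is freed, which is the property the slide's last bullet is after.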

Page 26: Pushing Python: Building a High Throughput, Low Latency System

Ratcheting

• Ratcheting is a pathological case of fragmentation, caused by the fact that the heap must be contiguous*.
• It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).
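The effect can be illustrated with a toy model. The sketch below is not how CPython or the system allocator is implemented: `ToyHeap` is a hypothetical contiguous heap that can only shrink from the top and never moves live objects. It shows how a single long-lived allocation made late pins the heap at its high-water mark.

```python
class ToyHeap:
    """Toy contiguous heap: it can shrink only from the top, never compact."""

    def __init__(self):
        self.slots = []  # each slot holds an owner label, or None if free

    def alloc(self, owner):
        # First fit: reuse a free slot if one exists, otherwise grow the heap.
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = owner
                return i
        self.slots.append(owner)
        return len(self.slots) - 1

    def free(self, index):
        self.slots[index] = None
        # Trim: free slots are returned only while they sit at the very top.
        while self.slots and self.slots[-1] is None:
            self.slots.pop()

    @property
    def size(self):
        return len(self.slots)


def heap_size_after_burst(persistent_first, n_temporary=1000):
    """Allocate a burst of temporaries plus one long-lived object, then free the burst."""
    heap = ToyHeap()
    if persistent_first:
        heap.alloc("socket")
    temporaries = [heap.alloc("event") for _ in range(n_temporary)]
    if not persistent_first:
        heap.alloc("socket")  # lands above all of the temporary data
    for index in temporaries:
        heap.free(index)
    return heap.size


pinned = heap_size_after_burst(persistent_first=False)
trimmed = heap_size_after_burst(persistent_first=True)
```

In the first run the heap stays at 1001 slots although only one is live; in the second it shrinks back to a single slot.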

Page 27: Pushing Python: Building a High Throughput, Low Latency System

Ratcheting •  Ratcheting is a pathological case of Fragmentation,

caused by the fact that the heap must be contiguous*:

•  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).

Page 28: Pushing Python: Building a High Throughput, Low Latency System

Ratcheting •  Ratcheting is a pathological case of Fragmentation,

caused by the fact that the heap must be contiguous*:

•  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).

Page 29: Pushing Python: Building a High Throughput, Low Latency System

Ratcheting •  Ratcheting is a pathological case of Fragmentation,

caused by the fact that the heap must be contiguous*:

•  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).

Page 30: Pushing Python: Building a High Throughput, Low Latency System

Ratcheting

• Avoid persistent objects
• Sockets are common offenders
• Anything that has to be persistent should be created at application startup, before processing data
• Avoid letting the heap grow in the first place
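As a rough sketch of the startup advice, the hypothetical `Worker` below creates its long-lived sockets in the constructor, before any data is handled, so the processing path only creates short-lived temporaries. `socket.socketpair()` stands in for real connections, and the class and method names are assumptions for illustration.

```python
import socket


class Worker:
    """Hypothetical worker that sets up persistent objects before processing."""

    def __init__(self, n_connections=2):
        # Long-lived objects are created here, at startup, before the heap
        # has grown to hold any event data.
        self.connections = [socket.socketpair() for _ in range(n_connections)]
        self.processed = 0

    def process(self, chunk):
        # The hot path creates only temporaries that die with the call.
        values = [float(field) for field in chunk.split(",")]
        self.processed += len(values)
        return sum(values)

    def close(self):
        for left, right in self.connections:
            left.close()
            right.close()
        self.connections = []


worker = Worker()
chunk_total = worker.process("0.25,0.5,1.0")
processed = worker.processed
worker.close()
```

The opposite pattern, opening a connection lazily in the middle of a processing burst, is the case that risks leaving a persistent object above the temporary data.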

Page 31: Pushing Python: Building a High Throughput, Low Latency System

fin.

github.com/tellapart/taba

[email protected] | @misterkgb

We’re Hiring! tellapart.com/careers

