System Design & Scalability

A quick reference guide

By John DiFini

Overview
• Types of Scaling & Load Balancing
• Data Storage Design
• Message-oriented Middleware (MOM)
• Fault Handling
• Networking
• MapReduce

Types of Scaling & Load Balancing

Horizontal vs. Vertical Scaling

| Type | Scaling Process | Example | Complexity | Limited |
|---|---|---|---|---|
| Vertical | Add more resources to a single node | Add more memory to a single server | Easy | Yes (e.g. can only add so much memory) |
| Horizontal | Add more nodes | Add more servers | Hard | Practically unlimited |

Load Balancing
• Round Robin
• Source IP Hash - a given client IP address will always go to the same server
• Request Hash - a given request type will always go to the same server/cache; avoids cache duplication
• Least Connections
• Least Traffic
• Least Latency

Reference

http://serverfault.com/questions/112292/what-kind-of-load-balancing-algorithms-are-there
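Three of the load balancing strategies listed above can be sketched in a few lines. This is a minimal illustration, not a production balancer: the server names and the function names are hypothetical, and SHA-256 is just one reasonable choice of stable hash.

```python
import hashlib
from itertools import cycle

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical server names

_rotation = cycle(SERVERS)

def pick_round_robin() -> str:
    """Round Robin: hand out servers in a fixed rotation."""
    return next(_rotation)

def pick_by_source_ip(client_ip: str) -> str:
    """Source IP Hash: the same client IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]

def pick_least_connections(open_connections: dict) -> str:
    """Least Connections: choose the server with the fewest open connections."""
    return min(open_connections, key=open_connections.get)
```

Request Hash would look like `pick_by_source_ip`, except that the hashed value is something taken from the request (e.g. its type or path) rather than the client address.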

Data Storage Design

Database - Read vs. Write Performance
• Normalize vs. Denormalize
    Normalize - ↓duplicate data ⇨ ↑write perf but ↓read perf
    Denormalize - ↑duplicate data ⇨ ↑read perf but ↓write perf

• Have your cake and… - Use an append-only structure for writes; then asynchronously restructure data into a read-optimized format[*]

https://www.quora.com/What-are-the-pros-and-cons-of-using-the-Cassandra-database
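The append-only idea above can be illustrated with a toy example. This is only a sketch under stated assumptions: the order data, `write_log`, `record_order` and `rebuild_read_view` are hypothetical names, and the restructuring step is called directly here, whereas a real system would run it asynchronously.

```python
from collections import defaultdict

write_log = []  # append-only: writes never update existing entries

def record_order(customer: str, amount: float) -> None:
    """Write path: a cheap append, with no lookups or in-place updates."""
    write_log.append((customer, amount))

def rebuild_read_view(log) -> dict:
    """Restructure the log into per-customer totals that are cheap to read.

    A real system would run this in the background; it is called
    synchronously here to keep the sketch short.
    """
    totals = defaultdict(float)
    for customer, amount in log:
        totals[customer] += amount
    return dict(totals)
```

Writes stay fast because they only append, while reads hit the precomputed view instead of scanning the log.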

Database - Structure
• Relational - general purpose for tabular/table-based data
• Specialized - for data structures that don't easily fit the tabular format (e.g. multi-level nesting & hierarchies)
    NoSQL
    Others

Cache
• DB reads are expensive, so hold as much data in memory as possible
• Cache Hit - data were found in cache; Cache Miss - data not found, so retrieve it from DB[*]

• Local vs. Distributed
    Rule of Thumb - use a local cache for small data sets with a predictable number of immutable records[*]

• Cache Warming - anticipate queries and "prime" the cache not only on startup but also in real-time (e.g. load surrounding tiles of a recently-requested map)

https://en.wikipedia.org/wiki/Cache_(computing)

https://dzone.com/articles/process-caching-vs-distributed
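The hit/miss flow and the map-tile warming example above might look roughly like this. It is a sketch only: `TileCache`, `load_from_db` and `warm_neighbours` are hypothetical names, and the "DB" is any callable that returns a tile for an `(x, y)` key.

```python
class TileCache:
    """Local cache with hit/miss counters and simple cache warming."""

    def __init__(self, load_from_db):
        self._load = load_from_db  # hypothetical (slow) DB read
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, x: int, y: int):
        key = (x, y)
        if key in self._store:
            self.hits += 1    # cache hit: served from memory
        else:
            self.misses += 1  # cache miss: retrieve from the DB, then keep it
            self._store[key] = self._load(key)
        return self._store[key]

    def warm_neighbours(self, x: int, y: int) -> None:
        """Prime the cache with the tiles surrounding (x, y)."""
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                key = (x + dx, y + dy)
                if key not in self._store:
                    self._store[key] = self._load(key)
```

After a request for one tile, calling `warm_neighbours` for the same coordinates means a follow-up request for an adjacent tile is a hit rather than a miss.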

Cache - Replacement Policy
• Replacement Policy - algorithm used to maximize cache performance by choosing which data to eject & which data to add in its place[*]
    LRU - ejects the Least Recently Used data
    Advanced - considers access frequency, size of items, latency & throughput


https://en.wikipedia.org/wiki/Cache_replacement_policies
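A minimal LRU cache can be built on Python's `collections.OrderedDict`. This sketch shows only the basic LRU policy described above; the class name and capacity handling are illustrative choices, and none of the "advanced" factors (frequency, size, latency) are considered.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()  # least recently used entry comes first

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # eject the least recently used entry
```

With a capacity of 2, putting `a` and `b`, reading `a`, then putting `c` ejects `b`, because `a` was touched more recently.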

Data Store Sharding
Sharding - partition data across multiple nodes


| Type | Scaling Process | Drawback |
|---|---|---|
| Table-based | Put Table A on Node 1, Table B on Node 2, etc. | What if a table gets too large for its node? |
| Hash-based | Primary key is hashed, and every node is responsible for a range of hashed keys | What happens if the # of nodes changes? -> need to reallocate most of the data |
| Directory-based | A lookup service keeps track of which data are stored in which shard | What if the directory service is down (i.e. single point of failure)? What if the directory service has to process too many requests (i.e. a bottleneck)? |
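The hash-based drawback can be made concrete with a small experiment. As a simplifying assumption, this sketch assigns shards with `hash % node_count` rather than explicit hash ranges, and the key names and helper functions are hypothetical.

```python
import hashlib

def shard_for(key: str, node_count: int) -> int:
    """Map a primary key to a shard via a stable hash (modulo scheme, assumed)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % node_count

def fraction_moved(keys, old_nodes: int, new_nodes: int) -> float:
    """Share of keys whose shard changes when the node count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_nodes) != shard_for(k, new_nodes))
    return moved / len(keys)

keys = [f"user-{i}" for i in range(10_000)]
print(f"{fraction_moved(keys, 4, 5):.0%} of keys move when going from 4 to 5 nodes")
```

With a uniform hash, a key stays put only when `h % 4 == h % 5`, which happens for about one key in five, so roughly 80% of the data has to be reallocated after adding a single node.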

Message-oriented Middleware (MOM)

MOM Considerations
• Used by distributed systems to communicate amongst nodes[*]

• Abstracts OS & network intricacies (e.g. endian format, sockets, etc.)

https://en.wikipedia.org/wiki/Message-oriented_middleware

MOM Types

| Type | Use Case | Examples | Underlying Protocol | Cast |
|---|---|---|---|---|
| Request/Response | 1 sender; 1 receiver (point-to-point), e.g. Stock Trade Order[*] | Synchronous - JSON Web Services; Asynchronous - message queues like ActiveMQ, IBM MQ | TCP ("guaranteed" delivery) | Unicast |
| Publish/Subscribe | 1 sender; many receivers/listeners, e.g. Stock Tick | Kafka[*], TIBCO Rendezvous/RV | UDP (e.g. TIBCO RV; Kafka itself runs over TCP) | Broadcast (all nodes) or Multicast (node groups) |

http://stackoverflow.com/questions/9871136/how-is-tibco-rv-used-in-financial-software

https://www.quora.com/How-is-Kafka-different-from-typical-JMS-message-brokers-like-IBM-MQ-Active-MQ-etc
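The difference between the two messaging types can be shown with a toy in-process broker. This is purely illustrative: `Broker` and its methods are hypothetical, nothing here touches the network, and it does not reflect the API of ActiveMQ, IBM MQ, Kafka or TIBCO RV.

```python
from collections import defaultdict, deque

class Broker:
    """Toy in-process broker; a real MOM would sit on the network."""

    def __init__(self):
        self._queues = defaultdict(deque)      # point-to-point: one receiver per message
        self._subscribers = defaultdict(list)  # publish/subscribe: every listener gets a copy

    def send(self, queue: str, message) -> None:
        self._queues[queue].append(message)

    def receive(self, queue: str):
        """Remove and return the oldest message, or None if the queue is empty."""
        pending = self._queues[queue]
        return pending.popleft() if pending else None

    def subscribe(self, topic: str, callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message) -> None:
        for callback in self._subscribers[topic]:
            callback(message)
```

A trade order sent to a queue is consumed exactly once, whereas a stock tick published to a topic is delivered to every subscriber.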

Fault Handling

Fault Handling
• High Availability (HA) - delayed recovery to secondary
• Fault Tolerant - immediate recovery
    Active/Passive - primary fails over to secondary
    Active/Active - no primary vs. secondary; when 1 fails, the other(s) take the additional load
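The two arrangements above can be sketched as thin wrappers around callable nodes. This is an assumption-laden illustration: the class names are hypothetical, a node is any function that handles a request, and a failed node is modelled as one that raises `ConnectionError`.

```python
class ActivePassive:
    """The primary serves requests; on failure the secondary is promoted."""

    def __init__(self, primary, secondary):
        self.active, self.standby = primary, secondary

    def handle(self, request):
        try:
            return self.active(request)
        except ConnectionError:
            # Fail over: promote the standby and retry the request once.
            self.active, self.standby = self.standby, self.active
            return self.active(request)


class ActiveActive:
    """All nodes serve requests; survivors absorb the load of a failed node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._next = 0

    def handle(self, request):
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            try:
                return node(request)
            except ConnectionError:
                continue  # node is down; try the next one
        raise ConnectionError("all nodes are down")
```

Real deployments would detect failures with health checks or heartbeats rather than waiting for a request to fail; that detail is left out here.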

What is dead should never die

• Great YouTube video on the subject!
• @todo - explain no-special-node, ring topologies

https://youtu.be/bNeZYVIfskc

Networking

Network Metrics
• Bandwidth - The maximum amount of data that can be transferred in a unit of time (e.g. 100Mbps)[*]
• Throughput - The actual amount of data that is transferred in a unit of time (e.g. 88Mbps)
• Latency - The time it takes to send & receive (round-trip) a packet of data (e.g. 20ms)[*]

https://www.amazon.com/Cracking-Coding-Interview-Programming-Questions/dp/0984782850/ref=sr_1_1?s=books&ie=UTF8&qid=1486700171&sr=1-1&keywords=cracking+the+coding+interview

http://whatis.techtarget.com/definition/latency

Network Metrics - Analogy
Given a water pipe, its diameter determines its throughput, and its length determines its latency. Therefore, to improve:
• Throughput - Get a fatter pipe
• Latency - Colocate to reduce distance, or reduce network hops (point-to-point), which also reduces the distance that data have to travel
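A back-of-the-envelope estimate shows how the two metrics combine. The formula below is a deliberate simplification (one round trip of latency plus the time to push the bits; it ignores handshakes, congestion and protocol overhead), and the function name is hypothetical.

```python
def transfer_time_seconds(size_bytes: int, throughput_mbps: float, latency_ms: float) -> float:
    """Rough estimate: one round trip of latency plus time on the wire."""
    size_megabits = size_bytes * 8 / 1_000_000
    return latency_ms / 1000 + size_megabits / throughput_mbps

# Using the example figures from the slides: 88 Mbps throughput, 20 ms latency.
large = transfer_time_seconds(100_000_000, 88, 20)  # 100 MB payload
small = transfer_time_seconds(1_000, 88, 20)        # 1 KB payload
print(f"100 MB: {large:.2f} s, 1 KB: {small * 1000:.2f} ms")
```

For the large payload the time on the wire dominates, so a fatter pipe helps; for the small payload almost all of the time is latency, so only shortening the pipe helps.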

MapReduce

MapReduce
Uses parallel & distributed systems to process large data sets[*]

• Implementations - Spark, Hadoop, etc.[*]

• YouTube presentation

https://en.wikipedia.org/wiki/MapReduce

http://www.infoworld.com/article/3014440/big-data/five-things-you-need-to-know-about-hadoop-v-apache-spark.html

https://www.youtube.com/watch?v=zKbds9ZPjLE&list=PLmn3SmVFei-HTXOYp2LHZWZfiQ9j7XYnJ

MapReduce - Steps
Fundamentally, it consists of two steps, Map & Reduce, but a Shuffle step is also prevalent:
• Map - Organizes/filters/sorts. Think of putting elements into a typical Map interface with key-value pairs (e.g. <key, value>)
• Shuffle - Redistributes data so that all data pertaining to a given key reside on the same node

• Reduce - Summary/aggregation (e.g. sum all values for a given key)
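The three steps can be sketched with the classic word count, run in a single process. This is only an illustration of the data flow: the function names are hypothetical, and a real implementation such as Hadoop or Spark would spread each step across many nodes.

```python
from collections import defaultdict

def map_step(line: str):
    """Map: emit a <key, value> pair for every word."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_step(pairs):
    """Shuffle: group all values for a given key together."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_step(key, values):
    """Reduce: sum all values for a given key."""
    return key, sum(values)

def word_count(lines):
    mapped = [pair for line in lines for pair in map_step(line)]
    return dict(reduce_step(k, v) for k, v in shuffle_step(mapped).items())
```

In a distributed run, the grouping done by `shuffle_step` is what moves every pair for a given key onto the same node before the reduce step sees it.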

Coming Soon

Coming Soon
Define P9s


Date post:	13-Apr-2017
Category:	Technology
Upload:	john-difini