COMP9334: Capacity Planning of Computer Systems and Networks
Week 1-B: Introduction to Capacity Planning
Lecturer: Prof. Sanjay Jha
NETWORKS RESEARCH GROUP, CSE, UNSW
What is capacity planning?
Capacity is the maximum amount of work that a system can handle in a given period of time
Capacity can be increased through:
Introducing better techniques
Optimizing resource allocation
Adding extra hardware (always the last resort)
Capacity planning is the process of determining the capacity a system needs to meet a (changing) workload while guaranteeing a certain level of performance
Why capacity planning?
Discrepancy between capacity and workload results in inefficiency, either in under-utilized resources or poor system performance
The aim of capacity planning is to minimize this discrepancy
Capacity planning cycle
Determine desired service levels
Measure actual workload
Measure actual performance
Tune as necessary
Create performance models
Predict workload growth
Predict capacity required to maintain service levels
Main topics in this course
Workload model
Performance evaluation
Operational analysis, emulation, simulation, queueing analysis
How to optimize resource allocation?
These techniques are building blocks for capacity planning of computer systems
Importance of performance
Computer systems are pervasive
Mission-critical applications
Life support applications
Homeland security
Battlefield situations
Personal communication systems
Performance metrics
Also known as Quality of Service (QoS) metrics
Typical ones are:
Response time
Throughput
Availability
Reliability
Scalability
Response time
Measured in seconds; can be expressed as a mean, standard deviation, probability distribution, etc.
Possible performance specifications:
Mean response time is less than one second when no more than 5000 requests arrive per second
95% of the requests are completed within one second when no more than 5000 requests arrive per second
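A minimal sketch of how the two example specifications could be checked against measured response times. The function names and the sample data are hypothetical, and the percentile uses a simple nearest-rank rule, which is only one of several conventions.

```python
import math
from statistics import mean

def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank rule."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]

def meets_spec(response_times, limit=1.0):
    """Check both example specifications against response times in seconds."""
    return {
        "mean_ok": mean(response_times) < limit,
        "p95_ok": percentile_nearest_rank(response_times, 95) <= limit,
    }

# Hypothetical measurements: 19 fast requests and one very slow outlier.
samples = [0.2] * 19 + [30.0]
result = meets_spec(samples)
print(result)
```

With this made-up data the percentile specification holds while the mean specification fails, which illustrates that the two specifications are not interchangeable.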
Throughput
The rate at which requests are completed
E.g. for network routers, throughput can be measured in packets per second or MB per second
Throughput is a function of the load
Assume a disk takes 10 msec on average to perform an I/O operation
Maximum number of I/O operations per second = 100
If 50 I/O requests arrive per second, throughput = 50 I/O operations per second
If 110 I/O requests arrive per second, throughput = 100 I/O operations per second
So, throughput = min(offered load, capacity)
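The disk example can be reproduced with a short sketch. The function name and parameter names are illustrative, and the model is the idealised one from the slide: it ignores any degradation under overload.

```python
def throughput(offered_load, service_time_ms):
    """Throughput in operations per second for a resource with a fixed
    mean service time, using throughput = min(offered load, capacity)."""
    capacity = 1000 / service_time_ms  # operations per second
    return min(offered_load, capacity)

# The disk from the slide: 10 msec per I/O operation.
for load in (50, 110):
    print(load, throughput(load, service_time_ms=10))
```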
Throughput (cont.)
Thrashing = congestion collapse
Availability
Fraction of time a system is up and available to its customers
E.g., it is common for Internet Service Providers (ISPs) to sign Service Level Agreements (SLAs) with their commercial customers. Suppose one ISP guarantees that its network outage (downtime) is less than 6 hours per 30 days. The network availability is then at least 1 − 6/(30 × 24) = 99.17%
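The availability arithmetic in the ISP example can be written as a small helper. The function name is illustrative only.

```python
def availability(downtime_hours, period_days):
    """Fraction of the period during which the system is up."""
    period_hours = period_days * 24
    return 1 - downtime_hours / period_hours

# The ISP example: at most 6 hours of outage per 30 days.
a = availability(downtime_hours=6, period_days=30)
print(f"{a:.2%}")
```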
Reliability
The probability that a system functions properly and continuously over a fixed period of time
Related terms: MTTF (mean time to failure), where a faulty part is replaced; MTBF (mean time between failures), where the fault is repaired.
Analogy: failure of a car's music system vs failure of its engine
Scalability
How fast does performance degrade with increasing load?
System B is more scalable
Why probability?
The techniques that we are going to study in this course are largely based on probability theory. Why?
Let us say 500 HTTP requests arrive at the web server in one second
In a deterministic world, an HTTP request arrives every 2 ms
But the arrival pattern in reality is not deterministic; it is random
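A minimal sketch contrasting the two arrival patterns at 500 requests per second. The random case assumes exponentially distributed inter-arrival times (a Poisson arrival process), which is one common modelling choice and not something the slide specifies. The function names and the seed are arbitrary.

```python
import random

def deterministic_arrivals(rate, n):
    """Arrival times when requests are spaced exactly 1/rate apart."""
    gap = 1.0 / rate
    return [(i + 1) * gap for i in range(n)]

def random_arrivals(rate, n, seed=1):
    """Arrival times with exponential inter-arrival gaps (assumed model)."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n):
        t += rng.expovariate(rate)  # mean gap is 1/rate
        times.append(t)
    return times

def gaps(times):
    """Inter-arrival gaps, measuring the first gap from time 0."""
    return [b - a for a, b in zip([0.0] + times[:-1], times)]

det = gaps(deterministic_arrivals(500, 500))
rnd = gaps(random_arrivals(500, 500))
print("deterministic gaps (s):", min(det), max(det))
print("random gaps (s):       ", min(rnd), max(rnd))
```

Both patterns have the same average rate, but the random one contains both very short and very long gaps, so requests sometimes arrive in bursts.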
What is a model?
An abstraction of some aspects of the real system
Ignore some of the complexities of the real system
Retain what is essential to meet analysis requirements
E.g. to study the performance of a database server, the complexity of a magnetic disk is simplified to the number of I/O operations per second
Never match the real system exactly, even in the aspects we model
Answer what-if questions about the real system
Avoid costly and time-consuming experiments
Analytic model ⇔ Simulation model
Analytic models rely on simplifying assumptions to be mathematically tractable
Less detailed, less accurate, but more efficient to run
Preferable for capacity planning purposes, especially for optimization
Simulation models are more flexible, can capture more system behaviors
More detailed, more accurate, but less efficient to run
What is a queue?
Customers arrive at random times
Wait for service, if all servers are busy
Get selected for service, if more than one waiting
Require random service time
Depart when service complete
Response time
Response time (R) = Time in waiting (W) + Time in service (S)
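A small sketch of how W, S and R relate for each customer. It assumes a single server that serves customers first come first served, and the arrival and service times are made up for illustration.

```python
def fcfs_times(arrivals, services):
    """Per-customer (waiting, service, response) times for one FCFS server.

    arrivals: arrival times in increasing order
    services: service time required by each customer
    """
    results = []
    server_free_at = 0.0
    for arrival, service in zip(arrivals, services):
        start = max(arrival, server_free_at)  # wait if the server is busy
        waiting = start - arrival
        response = waiting + service          # R = W + S
        server_free_at = start + service
        results.append((waiting, service, response))
    return results

# Hypothetical customers arriving at t = 0, 1, 2.
for w, s, r in fcfs_times([0.0, 1.0, 2.0], [3.0, 1.0, 1.0]):
    print(f"W={w} S={s} R={r}")
```

The second and third customers need only one time unit of service, but their response time is three units because they wait behind the first customer.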
How can queues behave?
E.g. web server
Average of 1 msec to handle each request
If the arrival rate is 1000 requests per second, the server can cope if arrivals are uniform
Queues build up when there is variability in the arrival pattern
A burst of requests may have to wait in a queue
If the queue is full, i.e. overflows, requests are dropped
Large queue length → Large response time
How can queues behave? (cont.)
Normalised      Average   Average   Average
arrival rate    input     output    queue
0.5             500       500       43
0.95            950       907       1859
0.99            990       942       2583
Arrival rate increases by a factor of 2 → Queue length increases by a factor of 60
Small increase in arrival rate from 0.95 to 0.99 → Queue length increases by 40%
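The qualitative behaviour above (queues growing sharply as the load approaches 1, and drops when the buffer overflows) can be explored with a small simulation. This is a sketch under assumptions not stated in the slides: a single FCFS server, exponential inter-arrival and service times, and a mean service time normalised to 1. It does not reproduce the figures in the table, since the setup behind them is not given.

```python
import random
from collections import deque

def simulate(load, n_arrivals=50_000, buffer_size=5_000, seed=1):
    """Simulate one FCFS server with a finite buffer.

    load: arrival rate relative to a service rate of 1
    buffer_size: maximum requests in the system, including the one in service
    Returns (fraction accepted, mean number in system seen by arrivals).
    """
    rng = random.Random(seed)
    now = 0.0
    departures = deque()  # departure times of requests still in the system
    accepted = 0
    seen_total = 0
    for _ in range(n_arrivals):
        now += rng.expovariate(load)          # next arrival
        while departures and departures[0] <= now:
            departures.popleft()              # these requests have left
        seen_total += len(departures)
        if len(departures) >= buffer_size:
            continue                          # buffer full: request dropped
        start = departures[-1] if departures else now
        departures.append(start + rng.expovariate(1.0))
        accepted += 1
    return accepted / n_arrivals, seen_total / n_arrivals

for load in (0.5, 0.95, 0.99):
    print(load, simulate(load))
```

Passing a small `buffer_size` (for example 10) makes the overflow case visible: some requests are dropped even though the load is below 1.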
Queueing disciplines
Rules that dictate how jobs are buffered while waiting to be serviced
First come first served (FCFS)
Strictly in the order of arrival, most common one
Priority queueing
Classified according to priority
Highest priority served first
FCFS for those within the same priority class
Variations: preemptive ⇔ non-preemptive, preemptive resume ⇔ preemptive restart
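A minimal sketch of priority queueing with FCFS inside each priority class. The class name `PriorityBuffer` and the convention that a lower number means higher priority are assumptions of this sketch. It only models which job is selected next, so it corresponds to the non-preemptive variation.

```python
import heapq
from itertools import count

class PriorityBuffer:
    """Releases the highest-priority job first, FCFS within a class."""

    def __init__(self):
        self._heap = []
        self._order = count()  # arrival sequence number breaks ties FCFS

    def put(self, job, priority):
        # Lower number = higher priority (a convention chosen for this sketch).
        heapq.heappush(self._heap, (priority, next(self._order), job))

    def get(self):
        _, _, job = heapq.heappop(self._heap)
        return job

    def __len__(self):
        return len(self._heap)

buf = PriorityBuffer()
for job, prio in [("a", 2), ("b", 1), ("c", 2), ("d", 1)]:
    buf.put(job, prio)
served = [buf.get() for _ in range(len(buf))]
print(served)
```

Jobs "b" and "d" share the higher priority and leave in arrival order, followed by "a" and "c".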
Queueing disciplines (cont.)
Round robin
Each job in the queue is served in turn for a short period of time
E.g. given n jobs, the circular order is 1, 2, ..., n, 1, 2, ...
Commonly used by operating systems to schedule the CPU among ready processes
Other possible ones: processor sharing, last come first served (LCFS), shortest job first, random
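Round robin, as described above, can be sketched with a double-ended queue. The job names, service requirements and quantum are made up, and the sketch ignores context-switch overhead and new arrivals.

```python
from collections import deque

def round_robin(service_needs, quantum):
    """Return the order in which jobs finish under round robin.

    service_needs: dict mapping job name -> total service time required
    quantum: length of the time slice each job gets per turn
    """
    queue = deque(service_needs.items())
    finished = []
    while queue:
        job, remaining = queue.popleft()
        if remaining <= quantum:
            finished.append(job)                      # completes in this turn
        else:
            queue.append((job, remaining - quantum))  # rejoin at the back
    return finished

order = round_robin({"job1": 5, "job2": 2, "job3": 3}, quantum=2)
print(order)
```

Short jobs finish early even when they arrive behind a long job, which is one reason operating systems favour this discipline.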
Queueing models
Pervasively used in the study of computer systems
E.g. CPU, disk, router, switch, communications link
Each resource can be modelled as a single queue
A request may require service from multiple resources in a computer system, e.g. a database server
The whole system can be modelled as a network of queues, or queueing network
Database server for online transactions
The server has one CPU and one disk
A transaction may visit the CPU and disk multiple times before it completes
An open queueing network
Multi-class transactions
Why multi-class?
Heterogeneity in service demands, workloads and service level objectives
Modelling with one single class may give inaccurate performance predictions
Transaction   Percentage   Average CPU   Average     Maximum Average
Group         of Total     Time (sec)    # of I/Os   Response Time (sec)
Trivial       45%          0.04          5.5         1.2
Medium        25%          0.18          28.9        2.5
Complex       30%          1.20          85.0        8.0
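A short sketch of what collapsing the three transaction groups into one class would look like, using the figures from the table. The weighted-average approach is an illustration of the point being made, not a method prescribed by the slides.

```python
classes = {
    # name: (fraction of total, average CPU time in s, average number of I/Os)
    "Trivial": (0.45, 0.04, 5.5),
    "Medium":  (0.25, 0.18, 28.9),
    "Complex": (0.30, 1.20, 85.0),
}

def single_class_average(classes):
    """Workload-weighted average CPU time and I/O count over all classes."""
    avg_cpu = sum(frac * cpu for frac, cpu, _ in classes.values())
    avg_ios = sum(frac * ios for frac, _, ios in classes.values())
    return avg_cpu, avg_ios

avg_cpu, avg_ios = single_class_average(classes)
print(f"average CPU time: {avg_cpu:.3f} s, average I/Os: {avg_ios:.1f}")
```

The single-class average CPU time is roughly ten times the Trivial figure and roughly a third of the Complex figure, so it describes none of the groups well.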
Database server for batch jobs
Running batch jobs overnight
E.g. producing managerial reports
Assume once a job has completed, a new job starts
Maintain constant number of customers in the system
A closed queueing network
Open queueing network
External arrivals
Workload intensity specified by arrival rate
Unbounded number of customers in the system
In equilibrium, flow in = flow out ⇒ throughput = arrival rate
Closed queueing network
No external arrivals, batch jobs
Workload intensity specified by customer population
Bounded and known number of customers in the system
Throughput is a function of the customer population
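One way to see how throughput depends on the customer population is exact Mean Value Analysis (MVA), a standard technique for closed queueing networks that this slide does not name. The sketch below assumes fixed-rate queueing stations, no think time, and hypothetical service demands of 0.1 s at the CPU and 0.2 s at the disk per job.

```python
def mva_throughput(demands, population):
    """Throughput for 1..population jobs in a closed queueing network.

    demands: total service demand of one job at each station (seconds)
    Returns a list whose entry n-1 is the throughput with n jobs.
    """
    queue_len = [0.0] * len(demands)
    throughputs = []
    for n in range(1, population + 1):
        # Residence time: own demand plus the demand of jobs already queued.
        residence = [d * (1 + q) for d, q in zip(demands, queue_len)]
        x = n / sum(residence)                  # throughput with n jobs
        queue_len = [x * r for r in residence]  # mean queue length per station
        throughputs.append(x)
    return throughputs

# Hypothetical demands: CPU 0.1 s, disk 0.2 s per job.
for n, x in enumerate(mva_throughput([0.1, 0.2], population=10), start=1):
    print(n, round(x, 3))
```

Under these assumptions throughput rises with the number of jobs but levels off below 1/0.2 = 5 jobs per second, the limit set by the disk.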
Mixed queueing network
Service Level Agreements
Transaction     Maximum Average        Minimum
Group           Response Time (sec)    Throughput
Trivial         1.2                    -
Medium          2.5                    -
Complex         8.0                    -
Batch Reports   -                      20 per hour
Reference
Textbook Chapters 1 & 2