Post on 15-Mar-2020

COMP9334: Capacity Planning of Computer Systems and Networks

Week 1-B: Introduction to Capacity Planning

Lecturer: Prof. Sanjay Jha

NETWORKS RESEARCH GROUP, CSE, UNSW

What is capacity planning?

Capacity is the maximum amount of work that a system can handle in a given period of time

Capacity can be increased through:

Introducing better techniques

Optimizing resource allocation

Adding extra hardware (always the last resort)

Capacity planning is the process of determining the capacity a system needs to meet a (changing) workload while guaranteeing a certain level of performance

Why capacity planning?

A discrepancy between capacity and workload results in inefficiency: either under-utilized resources or poor system performance

The aim of capacity planning is to minimize this discrepancy

Capacity planning cycle

Determine desired service levels

Measure actual workload

Measure actual performance

Tune as necessary

Create performance models

Predict workload growth

Predict capacity required to maintain service levels

Main topics in this course

Workload model

Performance evaluation

Operational analysis, emulation, simulation, queueing analysis

How to optimize resource allocation?

These techniques are building blocks for capacity planning of computer systems

Importance of performance

Computer systems are pervasive

Mission-critical applications

Life support applications

Homeland security

Battlefield situations

Personal communication systems

Performance metrics

Also known as Quality of Service (QoS) metrics

Typical ones are:

Response time

Throughput

Availability

Reliability

Scalability

Response time

Measured in seconds; can be expressed as a mean, standard deviation, probability distribution, etc.

Possible performance specifications:

Mean response time is less than one second when no more than 5000 requests arrive per second

95% of the requests are completed within one second when no more than 5000 requests arrive per second
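
The two specifications above are not equivalent, and a small check makes the difference concrete. The following is a minimal sketch only: the function names and the sample of response times are hypothetical, invented for illustration, and the one-second limit and 95% fraction are taken from the specifications above.

```python
import statistics


def meets_mean_spec(response_times, limit_sec=1.0):
    """True if the mean response time is below the limit."""
    return statistics.mean(response_times) < limit_sec


def fraction_within(response_times, limit_sec=1.0):
    """Fraction of requests that completed within the limit."""
    return sum(1 for r in response_times if r <= limit_sec) / len(response_times)


def meets_percentile_spec(response_times, limit_sec=1.0, fraction=0.95):
    """True if at least `fraction` of requests completed within the limit."""
    return fraction_within(response_times, limit_sec) >= fraction


# Hypothetical sample (seconds): 90 fast requests and 10 slow ones.
sample = [0.1] * 90 + [2.0] * 10

print("mean spec met:      ", meets_mean_spec(sample))
print("percentile spec met:", meets_percentile_spec(sample))
```

With this made-up sample the mean is well under one second, yet only 90% of the requests finish within one second, so the first specification is met and the second is not.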

Throughput

The rate at which requests are completed

E.g. for network routers, throughput can be measured in packets per second or MB per second

Throughput is a function of the load

Assume a disk takes 10 msec on average to perform an I/O operation

Maximum number of I/O operations per second = 100

If 50 I/O requests arrive per second, throughput = 50 I/O operations per second

If 110 I/O requests arrive per second, throughput = 100 I/O operations per second

So, throughput = min(offered load, capacity)
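
The disk example above can be reproduced in a few lines. This is a sketch under the slide's own simplification that throughput saturates exactly at capacity; the function and constant names are hypothetical.

```python
def throughput(offered_load, capacity):
    """Completed operations per second: the smaller of offered load and capacity."""
    return min(offered_load, capacity)


SERVICE_TIME_SEC = 0.010                # 10 msec per I/O operation, from the example
capacity = 1.0 / SERVICE_TIME_SEC       # maximum I/O operations per second

for load in (50, 110):
    print(f"offered load {load}/s -> throughput {throughput(load, capacity):.0f}/s")
```

Below capacity every request is eventually served, so throughput follows the offered load; above capacity the disk is the bottleneck and the excess requests wait or are dropped.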

Throughput (cont.)

Thrashing = congestion collapse

Availability

Fraction of time a system is up and available to its customers

E.g., it is common for Internet Service Providers (ISPs) to sign a Service Level Agreement (SLA) with their commercial customers. Suppose an ISP guarantees that its network outage (downtime) is less than 6 hours per 30 days. The guaranteed network availability is 1 − 6/(30 × 24) = 99.17%
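
The availability arithmetic in the ISP example can be written as a small helper. This is an illustrative sketch; the function name is hypothetical and the figures are the ones quoted above.

```python
def availability(downtime_hours, period_hours):
    """Fraction of the period during which the system is up."""
    return 1.0 - downtime_hours / period_hours


# 6 hours of downtime allowed in a 30-day period, as in the SLA example.
sla_availability = availability(downtime_hours=6, period_hours=30 * 24)
print(f"{sla_availability:.2%}")  # prints 99.17%
```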

Reliability

The probability that a system functions properly and continuously over a fixed period of time

Related terms: MTTF (mean time to failure), used when a faulty part is replaced; MTBF (mean time between failures), used when a fault is repaired.

Analogy: Failure of Music System vs Engine for a Car

Scalability

How fast does performance degrade with increasing load?

System B is more scalable

Why probability

The techniques that we are going to study in this course are largely based on probability theory. Why?

Let us say 500 HTTP requests arrive at the web server in one second

In a deterministic world, an HTTP request arrives every 2 ms

But the arrival pattern in reality is not deterministic; it is random
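
The contrast between deterministic and random arrivals can be seen by counting arrivals in short windows. The sketch below keeps the 500 requests per second from the example; everything else is an assumption made for illustration: the slides only say arrivals are random, so the exponential inter-arrival gaps, the 10 ms window, the seed, and all names are hypothetical choices.

```python
import random

RATE_PER_MS = 0.5     # 500 requests per second, as in the example above
DURATION_MS = 1000.0  # observe one second
WINDOW_MS = 10.0      # hypothetical window size for counting arrivals


def deterministic_arrivals(rate_per_ms, duration_ms):
    """Evenly spaced arrival times: one request every 1/rate milliseconds."""
    gap_ms = 1.0 / rate_per_ms
    return [i * gap_ms for i in range(int(rate_per_ms * duration_ms))]


def random_arrivals(rate_per_ms, duration_ms, rng):
    """Arrival times with exponential gaps (an assumed model of randomness)."""
    times = []
    t = rng.expovariate(rate_per_ms)
    while t < duration_ms:
        times.append(t)
        t += rng.expovariate(rate_per_ms)
    return times


def counts_per_window(times, window_ms, duration_ms):
    """Number of arrivals that fall in each consecutive window."""
    n_windows = int(duration_ms / window_ms)
    counts = [0] * n_windows
    for t in times:
        counts[min(int(t / window_ms), n_windows - 1)] += 1
    return counts


rng = random.Random(9334)  # fixed seed so the run is repeatable
det = counts_per_window(deterministic_arrivals(RATE_PER_MS, DURATION_MS),
                        WINDOW_MS, DURATION_MS)
rnd = counts_per_window(random_arrivals(RATE_PER_MS, DURATION_MS, rng),
                        WINDOW_MS, DURATION_MS)
print("deterministic: min", min(det), "max", max(det))
print("random:        min", min(rnd), "max", max(rnd))
```

In the deterministic case every 10 ms window holds exactly five arrivals. In the random case some windows are nearly empty and others hold bursts, which is the behaviour that probability theory is needed to describe.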

What is a model?

An abstraction of some aspects of the real system

Ignore some of the complexities of the real system

Retain what is essential to meet analysis requirements

E.g. to study the performance of a database server, the complexity of a magnetic disk is simplified to a number of I/O operations per second

Never match the real system exactly, even in the aspects we model

Answer what-if questions about the real system

Avoid costly and time-consuming experiments

Analytic model ⇔ Simulation model

Analytic models rely on simplifying assumptions to be mathematically tractable

Less detailed, less accurate, but more efficient to run

Preferable for capacity planning purposes, especially for optimization

Simulation models are more flexible, can capture more system behaviors

More detailed, more accurate, but less efficient to run

What is a queue?

Customers arrive at random times

Wait for service, if all servers are busy

Get selected for service, if more than one waiting

Require random service time

Depart when service complete

Response time

Response time (R) = Time in waiting (W) + Time in service (S)

How can queues behave?

E.g. web server

Average of 1 msec to handle each request

If the arrival rate is 1000 requests per second, the server can cope if arrivals are uniform

Queues build up when there is variability in the arrival pattern

A burst of requests may have to wait in a queue

If the queue is full, i.e. overflows, requests are dropped

Large queue length → Large response time
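
The web server example can be explored with a tiny single-server simulation that applies R = W + S to each request. This is a sketch under stated assumptions, not the course's model: the service time is fixed at exactly 1 msec, the random case uses exponential inter-arrival gaps with a mean of 1 msec, the queue is unbounded (so no requests are dropped), service is first come first served, and all names are hypothetical. Waiting times follow the standard recursion W(next) = max(0, W + S − gap).

```python
import random
import statistics

SERVICE_MS = 1.0     # 1 msec per request, taken as constant here (an assumption)
N_REQUESTS = 10_000  # hypothetical run length


def waiting_times(gaps_ms, service_ms):
    """Waiting time of each request at a single FCFS server with unbounded queue.

    gaps_ms[i] is the time between arrival i-1 and arrival i (gaps_ms[0] is unused).
    """
    waits = [0.0]  # the first request finds the server idle
    for gap, service in zip(gaps_ms[1:], service_ms):
        # Work left over from the previous request, minus the time that passed.
        waits.append(max(0.0, waits[-1] + service - gap))
    return waits


def mean_response(gaps_ms, service_ms):
    """Mean of R = W + S over all requests."""
    waits = waiting_times(gaps_ms, service_ms)
    return statistics.mean([w + s for w, s in zip(waits, service_ms)])


services = [SERVICE_MS] * N_REQUESTS
uniform_gaps = [1.0] * N_REQUESTS             # one arrival every 1 msec
rng = random.Random(9334)                     # fixed seed for repeatability
random_gaps = [rng.expovariate(1.0) for _ in range(N_REQUESTS)]  # mean gap 1 msec

print("uniform arrivals, mean R (msec):", mean_response(uniform_gaps, services))
print("random arrivals,  mean R (msec):", mean_response(random_gaps, services))
```

With uniform arrivals no request ever waits, so the mean response time equals the service time. With random arrivals at the same average rate, bursts pile up behind the server and the mean response time is noticeably larger.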

How can queues behave? (cont.)

| Normalised arrival rate | Average input | Average output | Average queue |
|---|---|---|---|
| 0.5 | 500 | 500 | 43 |
| 0.95 | 950 | 907 | 1859 |
| 0.99 | 990 | 942 | 2583 |

Arrival rate increases by a factor of 2 → Queue length increases by a factor of 60

Small increase in arrival rate from 0.95 to 0.99 → Queue length increases by 40%
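
The two growth figures quoted above can be checked directly against the table. The snippet below is only a worked calculation; the dictionary and function names are hypothetical, and the numbers are copied from the table.

```python
# Average queue length by normalised arrival rate, copied from the table above.
AVG_QUEUE = {0.5: 43, 0.95: 1859, 0.99: 2583}


def growth(low, high):
    """Return (arrival-rate ratio, queue-length ratio) between two table rows."""
    return high / low, AVG_QUEUE[high] / AVG_QUEUE[low]


for low, high in ((0.5, 0.99), (0.95, 0.99)):
    rate_ratio, queue_ratio = growth(low, high)
    print(f"{low} -> {high}: arrival rate x{rate_ratio:.2f}, "
          f"queue length x{queue_ratio:.2f}")
```

Going from 0.5 to 0.99 roughly doubles the arrival rate while the queue grows about sixtyfold. Going from 0.95 to 0.99 raises the arrival rate by about 4% while the queue grows by about 39%, which the slide rounds to 40%.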

Queueing disciplines

Rules that dictate how jobs are buffered while waiting to be serviced

First come first served (FCFS)

Strictly in the order of arrival, most common one

Priority queueing

Classified according to priority

Highest priority served first

FCFS for those within the same priority class

Variations: preemptive ⇔ non-preemptive, preemptive resume ⇔ preemptive restart
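
The priority rules above (highest priority first, FCFS within a class) can be sketched with a binary heap. This is one possible illustration of the non-preemptive variant, not a prescribed implementation: the class name, the method names, and the convention that a smaller number means higher priority are all assumptions.

```python
import heapq
import itertools


class PriorityQueueDiscipline:
    """Non-preemptive priority queue with FCFS order inside each priority class.

    Assumption: a smaller priority number means a higher priority.
    """

    def __init__(self):
        self._heap = []
        self._arrival_order = itertools.count()  # tie-breaker: order of arrival

    def arrive(self, job, priority):
        """Buffer a job; the arrival counter keeps equal priorities in FCFS order."""
        heapq.heappush(self._heap, (priority, next(self._arrival_order), job))

    def next_job(self):
        """Select the waiting job with the highest priority, oldest first."""
        _priority, _order, job = heapq.heappop(self._heap)
        return job

    def __len__(self):
        return len(self._heap)


queue = PriorityQueueDiscipline()
for job, priority in (("a", 2), ("b", 1), ("c", 2), ("d", 1)):
    queue.arrive(job, priority)

served = [queue.next_job() for _ in range(len(queue))]
print(served)  # ['b', 'd', 'a', 'c']
```

Jobs b and d (priority 1) are served before a and c (priority 2), and within each class the earlier arrival goes first. A preemptive variant would additionally interrupt the job in service when a higher-priority job arrives, which this sketch does not model.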

Queueing disciplines (cont.)

Round robin

Each job in the queue is served in turn for a short period of time

E.g. given n jobs, the circular order is 1, 2, ..., n, 1, 2, ...

Commonly used by operating systems to schedule the CPU among ready processes
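
The circular order of round robin can be sketched with a double-ended queue. This is a minimal illustration, assuming a fixed time quantum and jobs that leave once their service demand is met; the function name, job names, and demands are hypothetical.

```python
from collections import deque


def round_robin(demands, quantum):
    """Return the sequence of (job, time served) slices under round robin.

    demands maps each job name to its total service demand.
    """
    queue = deque(demands.items())
    slices = []
    while queue:
        job, remaining = queue.popleft()
        served = min(quantum, remaining)
        slices.append((job, served))
        if remaining > served:
            # Unfinished job rejoins the back of the queue for another turn.
            queue.append((job, remaining - served))
    return slices


schedule = round_robin({"job1": 3, "job2": 1, "job3": 2}, quantum=1)
print([job for job, _served in schedule])
# ['job1', 'job2', 'job3', 'job1', 'job3', 'job1']
```

Short jobs such as job2 finish after a single turn, while longer jobs cycle through the queue several times, so no job monopolises the server.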

Other possible ones: processor sharing, last come first served (LCFS), shortest job first, random

Queueing models

Pervasively used in study of computer systems

E.g. CPU, disk, router, switch, communications link

Each resource can be modelled as a single queue

A request may require service from multiple resources in a computer system, e.g. a database server

The whole system can be modelled as a network of queues, or queueing network

Database server for online transactions

The server has one CPU and one disk

A transaction may visit the CPU and disk multiple times before it completes

An open queueing network

Multi-class transactions

Why multi-class?

Heterogeneity in service demands, workloads and service level objectives

Modelling with a single class may give inaccurate performance predictions

| Transaction Group | Percentage of Total | Average CPU Time (sec) | Average # of I/Os | Maximum Average Response Time (sec) |
|---|---|---|---|---|
| Trivial | 45% | 0.04 | 5.5 | 1.2 |
| Medium | 25% | 0.18 | 28.9 | 2.5 |
| Complex | 30% | 1.20 | 85.0 | 8.0 |
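
One way to see why a single class can mislead is to collapse the table above into one "average" transaction. The calculation below is an illustrative sketch only: weighting each class by its percentage of the total is an assumption about how a single-class model would be built, and the names are hypothetical.

```python
# (share of total, average CPU time in sec, average number of I/Os) per class,
# copied from the table above.
CLASSES = {
    "Trivial": (0.45, 0.04, 5.5),
    "Medium": (0.25, 0.18, 28.9),
    "Complex": (0.30, 1.20, 85.0),
}


def single_class_average(column):
    """Share-weighted average of one column (1 = CPU time, 2 = number of I/Os)."""
    return sum(row[0] * row[column] for row in CLASSES.values())


avg_cpu = single_class_average(1)
avg_ios = single_class_average(2)
print(f"single-class CPU time: {avg_cpu:.3f} sec, I/Os: {avg_ios:.1f}")
```

The single-class transaction needs about 0.423 sec of CPU and 35.2 I/Os. That is roughly ten times the CPU demand of a trivial transaction and about a third of a complex one, so it describes none of the three groups well and cannot be checked against their separate response time targets.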

Database server for batch jobs

Running batch jobs overnight

E.g. producing managerial reports

Assume once a job has completed, a new job starts

Maintain constant number of customers in the system

A closed queueing network

Open queueing network

External arrivals

Workload intensity specified by arrival rate

Unbounded number of customers in the system

In equilibrium, flow in = flow out ⇒ throughput = arrival rate

Closed queueing network

No external arrivals, batch jobs

Workload intensity specified by customer population

Bounded and known number of customers in the system

Throughput is a function of the customer population

Mixed queueing network

Service Level Agreements

| Transaction Group | Maximum Average Response Time (sec) | Minimum Throughput |
|---|---|---|
| Trivial | 1.2 | - |
| Medium | 2.5 | - |
| Complex | 8.0 | - |
| Batch Reports | - | 20 per hour |

Reference

Textbook Chapters 1 & 2
