Page 1: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Scheduling

CS623, Lecture 7 3/9/2004

© Joel Wein, updated by T. Suel

Page 2: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Reading Materials:

Stallings textbook, Chapter 9 – Background, Fair-Share Scheduler.

Carl A. Waldspurger and William E. Weihl, "Lottery Scheduling: Flexible Proportional-Share Resource Management," Proc. of the First Symposium on Operating Systems Design and Implementation (OSDI), 1994.

C. A. Waldspurger and W. E. Weihl, "Stride Scheduling: Deterministic Proportional-Share Resource Management," Technical Memorandum MIT/LCS/TM-528, MIT Laboratory for Computer Science, 1995.

P. Goyal, X. Guo, and H. M. Vin, "A Hierarchical CPU Scheduler for Multimedia Operating Systems," Proc. of the Second Symposium on Operating Systems Design and Implementation (OSDI), 1996.

Page 3: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Outline

Basics (Stallings 9.3)
Fair-Share Scheduler (Stallings 9.3)
Lottery Scheduling
Stride Scheduling
QLinux

Page 4: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Short-Term Scheduler

Medium-Term Scheduler: swapping.
Short-Term Scheduler: what to execute next.
– Gives small slices of time to processes.
– Some other objectives (fairness and others).

Page 5: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Basic Strategies

Priorities
FCFS (first-come, first-served)
RR (round robin)
SPN (shortest process next)
SRPT (shortest remaining processing time)
HRRN (highest response ratio next; related to stretch)
Feedback (penalizes processes the longer they have run)

Page 6: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Fair-Share Scheduling

Traditional techniques treat the collection of ready processes as a single pool from which to choose the next one to run.
– Broken down by priority, but otherwise homogeneous.
There might be structure in the collection of processes that is not recognized by a traditional scheduler.
– A user might want his set of processes to make progress as a whole, not so much any one individual process.
– Or a group of users (a department).

Page 7: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Fair Share Strategy

Each user is assigned a weighting that defines that user's share of system resources as a fraction of the total usage of those resources.
– If user A has twice the weighting of user B, then in the long run A should be able to do twice as much work.
– Objective of the scheduler: monitor usage and give fewer resources to users who have had more than their fair share, and more to those who have had less.

Page 8: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

FSS

G. Henry, "The Fair Share Scheduler," 1984.
Divide the user community into a set of fair-share groups and allocate a fraction of the processor resource to each group.
– Each fair-share group can be thought of as a system that is proportionally slower than the full system.
Scheduling is done on the basis of priority, which takes into account:
– the priority of the process (a higher number means lower priority)
– the recent processor usage of the process
– the recent processor usage of the group it belongs to

Page 9: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Fair Share Scheduling

See the equations on page 420 of Stallings.
Each process is assigned a base priority.
The priority of a process drops as the process uses the processor and as the group to which the process belongs uses the processor.
For group utilization, the average is normalized by dividing by the weight of the group.
The greater the weight of the group, the less its utilization will affect its priority.

Page 10: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

FSS

Processor utilization is measured as follows:
– The running process is interrupted 60 times per second.
– During each interrupt, the processor usage field of the currently running process is incremented, as is the corresponding group processor field.
– Once per second, priorities are recalculated.
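
To make the once-per-second recalculation concrete, here is a minimal Python sketch, assuming the formulation in Stallings' fair-share example (each usage counter is halved every second, and priority = base + CPU/2 + GCPU/(4·W), where W is the group's weight); the class and field names are illustrative, not from the lecture.

```python
# Minimal sketch of the once-per-second fair-share priority recalculation,
# assuming the decay-and-weight formulation from Stallings (names illustrative).

class Group:
    def __init__(self, weight):
        self.weight = weight   # this group's share of the processor (0..1]
        self.gcpu = 0          # tick count summed over the group's processes

class Process:
    def __init__(self, base_priority, group):
        self.base = base_priority   # base priority (higher number = lower priority)
        self.cpu = 0                # tick count: incremented 60x/sec while running
        self.priority = base_priority
        self.group = group

def recalculate(processes, groups):
    """Called once per second, after the 60-Hz tick counters have accumulated."""
    for g in groups:
        g.gcpu /= 2                 # decay group usage
    for p in processes:
        p.cpu /= 2                  # decay per-process usage
        # heavier recent usage (by the process or its group) lowers priority;
        # group usage is normalized by the group's weight
        p.priority = p.base + p.cpu / 2 + p.group.gcpu / (4 * p.group.weight)
```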

Page 11: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Lottery Scheduling: Motivation

Policy can have enormous impact on throughput and response time.

“Accurate control over quality of service provided to users and applications requires support for specifying relative computation rates.”

For interactive applications, we need the ability to do this on a short time frame.

Page 12: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Lottery Scheduling: Problems with Traditional Schedulers

Priority systems are ad hoc at best: the highest priority always wins.
Fair-share schedulers:
– Relatively coarse control over long-running computations.
– "Algorithms are complex, requiring periodic usage updates, complicated dynamic priority adjustments, administrative parameter setting to ensure fairness on a time scale of minutes."
– Priority inversion.

Page 13: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Basics of Lottery Scheduling

Randomized resource allocation mechanism:
– Resource rights are represented by lottery tickets.
– Each allocation is determined by holding a lottery; the resource is granted to the client holding the winning ticket.

Page 14: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Lottery Scheduling: Resource Rights

Lottery tickets encapsulate resource rights that are abstract, relative, and uniform.
– Abstract: they quantify resource rights independently of machine details.
– Relative: the fraction of the resource they represent varies dynamically in proportion to the contention for that resource.
– Uniform: rights for heterogeneous resources can be homogeneously represented as tickets.

Page 15: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Lottery Scheduling: Lotteries

How fair is lottery scheduling?
– Probabilistically fair: the expected allocation of resources to clients is proportional to the number of tickets they hold.
– Since the scheduling algorithm is randomized, the actual allocated proportions are not guaranteed to match the expected proportions exactly.
– Over the "long term" the disparity decreases.

Page 16: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Lottery Fairness

The number of lotteries won by a client has a binomial distribution.
– Probability of winning for a client holding t out of T tickets: p = t/T.
– Expected number of wins in n trials: np.
Since any client with a non-zero number of tickets will eventually win a lottery, conventional starvation does not occur.
Lottery scheduling also operates fairly when the number of clients or tickets varies dynamically.
– For each allocation, any change in the relative ticket allocations is immediately reflected in the next allocation decision.
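
The "disparity decreases" claim follows from standard facts about the binomial distribution (not spelled out on the slide):

```latex
W \sim \mathrm{Binomial}(n, p), \quad p = \tfrac{t}{T}
\qquad\Rightarrow\qquad
\mathbb{E}[W] = np, \quad \sigma_W = \sqrt{np(1-p)}.

% Relative deviation from the expected share:
\frac{\sigma_W}{\mathbb{E}[W]} = \sqrt{\frac{1-p}{np}} = O\!\left(\tfrac{1}{\sqrt{n}}\right),
% so the absolute error grows only as sqrt(n), while the proportional error vanishes
% as the number of allocations n grows.
```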

Page 17: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Modular Resource Management

Tickets are a useful mechanism for modular resource management.
– They insulate the resource management policies of independent modules.
– They can be transferred.
Four techniques:
– Transfers
– Inflation
– Currencies
– Compensation tickets

Page 18: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Ticket Transfers

Explicit transfers of tickets from one client to another.
Can be used when a client blocks on some dependency.
Example: client-server RPC
– The server has no tickets of its own.
– Clients give the server all of their tickets during an RPC.
– The server's priority is the sum of the priorities of all its active clients.
– The server can use lottery scheduling to give preferential service to high-priority clients.
A very elegant solution to a long-standing problem.
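
As an illustration, a minimal sketch of an RPC-time ticket transfer; the class and function names here are hypothetical, not an interface from the paper or lecture:

```python
# Hypothetical sketch of ticket transfers during a blocking RPC.
# A client lends all of its tickets to the server for the duration of the call,
# so the server temporarily runs with the combined funding of its active clients.

class Client:
    def __init__(self, tickets):
        self.tickets = tickets

class Server:
    def __init__(self):
        self.tickets = 0            # the server owns no tickets of its own

def rpc(client, server, do_request):
    transferred = client.tickets
    client.tickets = 0              # client blocks; its tickets fund the server
    server.tickets += transferred
    try:
        return do_request()         # server computes on behalf of the client
    finally:
        server.tickets -= transferred
        client.tickets = transferred  # tickets return when the reply arrives
```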

Page 19: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Transfer

Transfers can be used to solve the priority inversion problem in a manner similar to priority inheritance.
A client could also divide its ticket transfers across the multiple servers on which it may be waiting.

Page 20: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Ticket Inflation

Client can bump up its priority by printing money.

Only works amongst mutually-trusting clients.

Allows clients to adjust their priority dynamically with zero communication.

Page 21: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Ticket Currencies

Can extend lottery scheduling to express resource rights in units that are local to each group of mutually trusting clients.
A unique currency exists within each trust boundary, with an exchange rate against the base currency.
Enables inflation just within a group.
Simplifies mini-lotteries, such as for a mutex.
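
As an illustration, a small sketch of how tickets denominated in a local currency can be valued in base units, assuming the paper's rule that a currency's tickets are worth its backing funds divided by the number of tickets it has issued; all names are illustrative:

```python
# Illustrative sketch of ticket currencies.  A ticket's base value is
# (base value backing the currency) / (tickets issued in the currency),
# applied recursively up to the base currency.

class Currency:
    def __init__(self, name, backing=None):
        self.name = name
        self.backing = backing or []   # list of (amount, parent_currency) funding this currency
        self.issued = 0                # total tickets issued in this currency

    def base_value_per_ticket(self):
        if not self.backing:           # the base currency: 1 ticket = 1 base unit
            return 1.0
        funds = sum(amount * parent.base_value_per_ticket()
                    for amount, parent in self.backing)
        return funds / self.issued if self.issued else 0.0

# Example: user "alice" is funded with 300 base tickets and issues 3 tickets
# of her own currency to her tasks, so each alice-ticket is worth 100 base units.
base = Currency("base")
alice = Currency("alice", backing=[(300, base)])
alice.issued = 3
print(alice.base_value_per_ticket())   # -> 100.0
```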

Page 22: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Compensation Tickets

What happens if a thread is I/O-bound and blocks before its quantum expires?
– Without adjustment, the thread would get less than its share of the processor.
– Compensation: if a client completes only a fraction f of its quantum, its tickets are inflated by 1/f until the next time it wins.
– Example: if B on average uses 1/5 of a quantum, its tickets are inflated 5x, so it wins 5 times as often and gets its correct share overall.
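
A tiny sketch of that rule (only the 1/f inflation comes from the slide; the function name and guard conditions are illustrative):

```python
# Compensation tickets: a client that used only a fraction f of its quantum
# has its effective ticket count inflated by 1/f until it next wins a lottery.

def effective_tickets(base_tickets, fraction_used):
    """fraction_used = portion of the last quantum consumed before blocking (0 < f <= 1)."""
    if fraction_used <= 0 or fraction_used >= 1:
        return base_tickets              # ran a full quantum (or never ran): no compensation
    return base_tickets / fraction_used  # e.g. f = 1/5 -> tickets counted 5x until next win

# Example from the slide: a client using 1/5 of each quantum is entered into
# lotteries with 5x its tickets, so it wins 5x as often and gets its full share.
print(effective_tickets(100, 0.2))   # -> 500.0
```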

Page 23: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Implementation Issues

Need a good random number generator.
Lotteries:
– Randomly select a winning ticket, then search the list of clients for the winner.
– Optimizations: order the client list by decreasing ticket counts, or use tree data structures.
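
A minimal sketch of the list-based lottery described above (names are illustrative; a tree of partial ticket sums would make selection logarithmic rather than linear):

```python
import random

# Straightforward lottery: draw a winning ticket uniformly at random and walk the
# client list, accumulating ticket counts, until the partial sum covers the draw.
# Keeping the list sorted by decreasing ticket count shortens the expected scan.

def hold_lottery(clients):
    """clients: list of (name, tickets) pairs with tickets > 0; returns the winner's name."""
    total = sum(t for _, t in clients)
    winner_ticket = random.randrange(total)   # winning ticket in [0, total)
    partial = 0
    for name, tickets in clients:
        partial += tickets
        if winner_ticket < partial:
            return name

# Example: over many drawings, "A" should win about 3x as often as "B".
clients = [("A", 75), ("B", 25)]
wins = {"A": 0, "B": 0}
for _ in range(10000):
    wins[hold_lottery(clients)] += 1
print(wins)
```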

Page 24: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Experimental Evaluation

60 seconds, 2 tasks, different ticket ratios.
– A 10:1 ticket ratio gave a 13.42:1 observed rate ratio. As the ratio increases, the randomness becomes less reliable.
Dynamically controlled ticket inflation: competing Monte Carlo simulations that have high error early on inflate their tickets.
Client-server experiment.

Page 25: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Experimental Evaluation

Multimedia applications:
– 3 mpeg_play video viewers.
– Ticket ratio 3:2:1; observed rates 1.92:1.5:1.
– Results were distorted by the round-robin processing of client requests by the single-threaded X11R5 server.

Page 26: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Use for Synchronization Resources

Contention due to synchronization can substantially affect computation rates.
– Lottery scheduling can help.
The Mach Cthreads library was extended to support a lottery-scheduled mutex type.
– Each such mutex has an associated mutex currency and an inheritance ticket.
All threads that block waiting for the mutex perform ticket transfers to fund the mutex currency.
The mutex transfers its inheritance ticket to the thread that currently holds the mutex.
THUS: the thread holding the mutex executes with its own funding plus the funding of all the waiting threads.

Page 27: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Use for Synchronization Resources

This solves the priority inversion problem, in which a mutex owner with little funding could execute very slowly due to competition with other threads, while a highly funded thread remains blocked on the mutex.
A 2-minute experiment with 2 groups of threads and a 2:1 ticket ratio yielded a 1.8:1 rate ratio.
Overall, not as fair as we'd like, but simple, elegant, OK.

Page 28: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Stride Scheduling

Basic Idea: Make a deterministic version of lottery scheduling to reduce short-term variability and improve accuracy.

Implements proportional-share control over processor time and other resources by applying elements of rate-based flow control algorithms designed for networks.

Page 29: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Stride Scheduling

Time quanta, tickets.
Absolute error: the difference between the specified and actual number of allocations.
Pairwise relative error: the absolute error for the subsystem containing just those two clients.
Lottery scheduling: expected errors grow as sqrt(n).
Stride scheduling: the relative error is never greater than 1; the absolute error can be O(N), where N is the number of clients.

Page 30: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Stride Scheduling: Basic Algorithm

Mark time virtually using “passes” as the unit as opposed to real seconds.

Compute a representation of the time interval – stride – that a client must wait between successive allocations.

Client with smallest stride will be scheduled most frequently.

A client with half the stride of another will execute twice as quickly.

Page 31: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Stride Scheduling: Basic Algorithm

Each client has three state variables:
– tickets: number of tickets.
– stride: inversely proportional to tickets; represents the interval between selections.
– pass: virtual time index for the client's next selection.

Page 32: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

How to Allocate a Resource

The client with the minimum pass is selected and its pass is advanced by its stride. If more than one client has the same minimum pass value, any of them may be selected.
Compensation (the analogue of compensation tickets): if the client consumed only a fraction f of its quantum, advance its pass by f*stride rather than by the full stride.
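
A minimal sketch of this core loop (the constant STRIDE1 and all names are illustrative; dynamic client joins/leaves and the paper's global-pass bookkeeping are omitted):

```python
# Core stride scheduling loop: stride = STRIDE1 / tickets; on each allocation
# pick the client with the minimum pass and advance its pass by its stride
# (or by f * stride if it consumed only a fraction f of its quantum).

STRIDE1 = 1 << 20     # large constant so integer strides stay precise

class Client:
    def __init__(self, name, tickets):
        self.name = name
        self.tickets = tickets
        self.stride = STRIDE1 // tickets
        self.passval = self.stride        # first selection one stride in the future

def allocate(clients, fraction_used=1.0):
    """Pick the next client to run and advance its pass; ties may be broken arbitrarily."""
    current = min(clients, key=lambda c: c.passval)
    current.passval += int(current.stride * fraction_used)
    return current

# Example: with a 3:2:1 ticket ratio, 60 allocations split 30:20:10.
clients = [Client("A", 3), Client("B", 2), Client("C", 1)]
counts = {c.name: 0 for c in clients}
for _ in range(60):
    counts[allocate(clients).name] += 1
print(counts)   # -> {'A': 30, 'B': 20, 'C': 10}
```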

Page 33: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Dynamic Client Participation

The basic algorithm does not support dynamic changes in the number of clients competing for a resource.
When clients are allowed to leave and join, the state must be appropriately modified.
– Global variables are used to track the aggregate state.

Page 34: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Problems

The relative error is good.
Absolute error: consider 101 clients with ticket ratio 100:1:…:1.
– After 100 allocations we wanted 50 units for the first client, but it got 100. Oops!
Remedy: hierarchical stride scheduling, which aggregates clients to improve interleaving.

Page 35: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Hierarchical Stride Scheduling

Recursive application of the basic stride scheduling algorithm.
– Individual clients are combined into groups with larger aggregate ticket allocations and correspondingly smaller strides.
– Allocation is performed by invoking the normal stride scheduling algorithm first among groups and then among the individual clients within the chosen group.
– Since systems often consist of a small number of high-throughput clients together with a large number of low-throughput clients, this helps.

Page 36: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

A Hierarchical CPU Scheduler for Multimedia Operating Systems

Consider the requirements imposed by the various application classes that can coexist in a multimedia system:
– Hard real-time applications (EDF, RMA).
– Soft real-time applications: need to statistically guarantee QoS parameters such as maximum delay and throughput, e.g. video. Due to multiple time-scale variations, the OS will be required to over-book the CPU; this may lead to CPU overload, yet some QoS guarantees are still needed. Cannot assume the requirements are known up front.
– Best-effort applications.

Page 37: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Bottom Line

Need different scheduling algorithms for different application classes in a multimedia system.

Need an OS framework that enables different schedulers to be employed for different applications.

Need to guarantee not just coexistence but protection between different classes of applications.

– For example, overbooking of CPU should not violate hard real-time constraints.

Page 38: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Solution

Hierarchical partitioning of CPU bandwidth:
– The OS should be able to partition the CPU bandwidth among various application classes, and each application class should be able to partition its allocation among subclasses or applications.
Hierarchical partitioning is specified by a tree.
– Each thread belongs to exactly one leaf node.
– Each node in the tree represents either an application class or an aggregation of application classes.

Page 39: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

Threads are scheduled by leaf-node-dependent schedulers.
Intermediate nodes are scheduled by an algorithm that:
1. Achieves fair distribution of the CPU resource
2. Does not require a priori information about threads' needs
3. Provides throughput guarantees
4. Is computationally efficient

Page 40: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

QLinux

QLinux is a Linux kernel that can provide quality-of-service guarantees. Based on the Linux 2.2.x kernel, it combines some of the latest innovations in operating systems research. It includes the following features:
– Hierarchical Start Time Fair Queuing (H-SFQ) CPU scheduler
– Hierarchical Start Time Fair Queuing (H-SFQ) network packet scheduler
– Lazy receiver processing (LRP) network subsystem
– Cello disk scheduler

Page 41: Scheduling CS623, Lecture 7 3/9/2004 © Joel Wein, updated by T. Suel.

QLinux

The H-SFQ CPU scheduler enables hierarchical scheduling of applications by fairly allocating CPU bandwidth to individual applications and application classes.

The H-SFQ packet scheduler provides rate guarantees and fair allocation of bandwidth to packets from individual flows as well as flow aggregates (classes).

Lazy receiver processing enables accurate charging of TCP/UDP protocol-processing overhead (including interrupt processing) to the appropriate process.

The Cello disk scheduler supports multiple application classes, such as interactive best-effort, throughput-intensive best-effort, and soft real-time, and fairly allocates disk bandwidth to these classes.
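
To make the H-SFQ idea concrete, here is a hedged sketch of plain, single-level start-time fair queueing, which H-SFQ applies at each node of the scheduling tree: a thread that becomes runnable gets a start tag S = max(virtual time, its previous finish tag) and, when it receives a quantum, a finish tag F = S + quantum/weight; the scheduler always runs the thread with the smallest start tag. The class names and fixed-quantum assumption are illustrative, not QLinux code.

```python
# Hedged sketch of (single-level) start-time fair queueing (SFQ), the building
# block that H-SFQ applies recursively at each node of the scheduling tree.
# Names and the fixed-quantum assumption are illustrative, not QLinux code.

class SfqThread:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight     # share of CPU bandwidth relative to siblings
        self.finish = 0.0        # finish tag of the last quantum it received

class SfqScheduler:
    def __init__(self):
        self.vtime = 0.0         # virtual time: start tag of the thread in service
        self.runnable = []

    def wake(self, thread):
        # a thread that becomes runnable starts no earlier than the current virtual time
        thread.start = max(self.vtime, thread.finish)
        self.runnable.append(thread)

    def pick_and_charge(self, quantum=1.0):
        # run the thread with the smallest start tag and charge it quantum/weight
        t = min(self.runnable, key=lambda th: th.start)
        self.vtime = t.start
        t.finish = t.start + quantum / t.weight
        t.start = t.finish       # still runnable: next quantum starts at its finish tag
        return t

# Example: weights 2:1 give thread "a" about two-thirds of the allocations.
sched = SfqScheduler()
for th in (SfqThread("a", 2.0), SfqThread("b", 1.0)):
    sched.wake(th)
counts = {"a": 0, "b": 0}
for _ in range(30):
    counts[sched.pick_and_charge().name] += 1
print(counts)   # -> roughly {'a': 20, 'b': 10}
```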
