7/29/2019 cpu-sched
1/54
CPU Scheduling
CS 519: Operating System Theory
Computer Science, Rutgers University
Instructor: Thu D. Nguyen
TA: Xiaoyan Li
Spring 2002
What and Why?
What is processor scheduling?
Why?
At first: to share an expensive resource (multiprogramming)
Now: to perform concurrent tasks because the processor is so powerful
Future looks like past + now
Rent-a-computer approach: large data/processing centers use multiprogramming to maximize resource utilization
Systems still powerful enough for each user to run multiple concurrent tasks
Assumptions
Pool of jobs contending for the CPU
CPU is a scarce resource
Jobs are independent and compete for resources (this assumption is not always used)
Scheduler mediates between jobs to optimize some performance criteria
Types of Scheduling
We're mostly concerned with short-term scheduling
What Do We Optimize?
System-oriented metrics:
Processor utilization: percentage of time the processor is busy
Throughput: number of processes completed per unit of time
User-oriented metrics:
Turnaround time: interval of time between submission and termination (including any waiting time). Appropriate for batch jobs
Response time: for interactive jobs, time from the submission of a request until the response begins to be received
Deadlines: when process completion deadlines are specified, the percentage of deadlines met should be maximized
Design Space
Two dimensions
Selection function
Which of the ready jobs should be run next?
Preemption
Preemptive: currently running job may be interrupted and moved to Ready state
Non-preemptive: once a process is in Running state, it continues to execute until it terminates or it blocks for I/O or system service
Job Behavior
I/O-bound jobs
Jobs that perform lots of I/O
Tend to have short CPU bursts
CPU-bound jobs
Jobs that perform very little I/O
Tend to have very long CPU bursts
Distribution tends to be hyper-exponential
Very large number of very short CPU bursts
A small number of very long CPU bursts
[Figure: alternating CPU and disk bursts]
Histogram of CPU-burst Times
Example Job Set
Process   Arrival Time   Service Time
   1           0              3
   2           2              6
   3           4              4
   4           6              5
   5           8              2
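As a quick sanity check on the job set above, here is a minimal sketch (not from the slides) of non-preemptive FCFS that computes per-process turnaround times; the job tuples reproduce the table's values, and the `fcfs` helper is hypothetical.

```python
# Non-preemptive FCFS over the example job set above.
# Turnaround = completion time - arrival time; jobs run in arrival order.
jobs = [  # (process, arrival, service) -- values from the table above
    (1, 0, 3), (2, 2, 6), (3, 4, 4), (4, 6, 5), (5, 8, 2),
]

def fcfs(jobs):
    time = 0
    turnaround = {}
    for pid, arrival, service in sorted(jobs, key=lambda j: j[1]):
        time = max(time, arrival) + service  # wait if the CPU is idle, then run to completion
        turnaround[pid] = time - arrival
    return turnaround

t = fcfs(jobs)
print(t)                         # {1: 3, 2: 7, 3: 9, 4: 12, 5: 12}
print(sum(t.values()) / len(t))  # mean turnaround: 8.6
```

Note how process 2's long service time (6) delays everything behind it, which is exactly the FIFO weakness discussed later.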
Behavior of Scheduling Policies
Multilevel Queue
Ready queue is partitioned into separate queues:
foreground (interactive)
background (batch)
Each queue has its own scheduling algorithm:
foreground: RR
background: FCFS
Scheduling must be done between the queues.
Fixed priority scheduling; i.e., serve all from foreground, then from background. Possibility of starvation.
Time slice: each queue gets a certain amount of CPU time which it can schedule amongst its processes; i.e.,
80% to foreground in RR
20% to background in FCFS
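The 80%/20% time-slice split between queues could be sketched as follows; the queue contents and the `pick_queue` helper are hypothetical, and the RR/FCFS dispatch within each queue is elided.

```python
import random

# Sketch of time-sliced selection between two queues (80% / 20%).
foreground = ["vi", "shell"]       # would be served round-robin
background = ["batch1", "batch2"]  # would be served FCFS

def pick_queue():
    # 80% of time slices go to the foreground queue, 20% to background;
    # an empty queue forfeits its slice to the other one.
    q = foreground if random.random() < 0.8 else background
    if not q:
        q = background if q is foreground else foreground
    return q
```

Over many slices, the foreground queue receives roughly 80% of the CPU while the background queue can never starve outright.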
Multilevel Queue Scheduling
Multilevel Feedback Queue
A process can move between the various queues; aging can be implemented this way.
Multilevel-feedback-queue scheduler defined by the following parameters:
number of queues
scheduling algorithms for each queue
method used to determine when to upgrade a process
method used to determine when to demote a process
method used to determine which queue a process will enter when that process needs service
Multilevel Feedback Queues
Example of Multilevel Feedback Queue
Three queues:
Q0 time quantum 8 milliseconds
Q1 time quantum 16 milliseconds
Q2: FCFS
Scheduling:
A new job enters queue Q0, which is served FCFS. When it gains the CPU, the job receives 8 milliseconds. If it does not finish in 8 milliseconds, the job is moved to queue Q1.
At Q1 the job is again served FCFS and receives 16 additional milliseconds. If it still does not complete, it is preempted and moved to queue Q2.
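The three-queue example might be simulated as below; `mlfq` is a hypothetical helper, and arrivals and I/O are ignored so jobs simply drain from Q0.

```python
from collections import deque

# Minimal sketch of the three-queue MLFQ above: quanta of 8 and 16 ms,
# then FCFS at the bottom. Jobs are (name, remaining_ms).
def mlfq(jobs, quanta=(8, 16)):
    queues = [deque(jobs), deque(), deque()]
    order = []
    while any(queues):
        # Fixed priority: always serve the highest non-empty queue.
        level = next(i for i, q in enumerate(queues) if q)
        name, remaining = queues[level].popleft()
        slice_ = quanta[level] if level < len(quanta) else remaining  # Q2 runs to completion
        order.append(name)
        if remaining > slice_:  # quantum expired: demote one level
            queues[min(level + 1, 2)].append((name, remaining - slice_))
    return order  # dispatch order

print(mlfq([("A", 5), ("B", 30)]))  # ['A', 'B', 'B', 'B']
```

Job A (5 ms) finishes in its first Q0 quantum; job B (30 ms) burns its 8 ms quantum, is demoted to Q1, burns 16 ms more, and finishes the final 6 ms in Q2.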
Traditional UNIX Scheduling
Multilevel feedback queues
128 priorities possible (0-127)
1 Round Robin queue per priority
At every scheduling event, the scheduler picks the non-empty queue with the lowest priority value and runs its jobs round-robin
Scheduling events:
Clock interrupt
Process does a system call
Process gives up CPU, e.g. to do I/O
Traditional UNIX Scheduling
All processes are assigned a baseline priority based on the type and current execution status:
swapper 0
waiting for disk 20
waiting for lock 35
user-mode execution 50
At scheduling events, all process priorities are adjusted based on the amount of CPU used, the current load, and how long the process has been waiting.
Most processes are not running, so lots of computing shortcuts are used when computing new priorities.
UNIX Priority Calculation
Every 4 clock ticks, a process's priority is updated:
The utilization is incremented every clock tick by 1.
The niceFactor allows some control of job priority. It can be set from -20 to 20.
Jobs using a lot of CPU increase the priority value. Interactive jobs not using much CPU will return to the baseline.
P = BASELINE + utilization / 4 + 2 × niceFactor
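The update rule can be sketched directly; `priority` is a hypothetical helper, integer division stands in for the /4 term, and the default BASELINE of 50 is the user-mode baseline from the earlier slide. Lower values mean higher priority in this scheme.

```python
# Sketch of the UNIX priority update: P = BASELINE + utilization/4 + 2*niceFactor.
def priority(utilization, nice_factor, baseline=50):
    return baseline + utilization // 4 + 2 * nice_factor

print(priority(utilization=40, nice_factor=0))   # 60: CPU-heavy job drifts to a worse value
print(priority(utilization=0, nice_factor=0))    # 50: idle job sits at the baseline
```

A negative niceFactor pulls the value back down, which is how privileged users boost a job's priority.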
UNIX Priority Calculation
Very long-running CPU-bound jobs will get stuck at the highest priority value.
Decay function used to weight utilization to recent CPU usage.
A process's utilization at time t is decayed every second:

u_t = (2 × load) / (2 × load + 1) × u_(t-1) + niceFactor

The system-wide load is the average number of runnable jobs during the last 1 second
UNIX Priority Decay
1 job on CPU. load will thus be 1. Assume niceFactor is 0.
Compute utilization at time N:
+1 second:  u_1 = (2/3) u_0
+2 seconds: u_2 = (2/3) u_1 = (2/3)(2/3) u_0 = (2/3)^2 u_0
+N seconds: u_N = (2/3) u_(N-1) = (2/3)^N u_0
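The decay recurrence, sketched under the slide's assumptions (load = 1, niceFactor = 0), shrinks utilization by a factor of 2/3 each second; `decay` is a hypothetical helper and the starting utilization of 90 is arbitrary.

```python
# u_t = (2*load)/(2*load + 1) * u_(t-1) + niceFactor; with load = 1 the
# factor is 2/3, so past CPU usage is forgotten geometrically.
def decay(u, load=1.0, nice_factor=0.0):
    return (2 * load) / (2 * load + 1) * u + nice_factor

u = 90.0
for _ in range(3):  # three seconds of decay
    u = decay(u)
print(round(u, 2))  # 26.67, i.e. 90 * (2/3)**3
```

This is why a job that stops using the CPU drifts back toward its baseline priority within a few seconds.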
Scheduling Algorithms
FIFO is simple but leads to poor average response times. Short processes are delayed by long processes that arrive before them
RR eliminates this problem, but favors CPU-bound jobs, which have longer CPU bursts than I/O-bound jobs
SJN, SRT, and HRRN alleviate the problem with FIFO, but require information on the length of each process. This information is not always available (although it can sometimes be approximated based on past history or user input)
Feedback is a way of alleviating the problem with FIFO without information on process length
It's a Changing World
Assumption about bi-modal workload no longer holds
Interactive continuous-media applications are sometimes processor-bound but require good response times
New computing model requires more flexibility
How to match priorities of cooperative jobs, such as client/server jobs?
How to balance execution between multiple threads of a single process?
Lottery Scheduling
Randomized resource allocation mechanism
Resource rights are represented by lottery tickets
Have rounds of lottery
In each round, the winning ticket (and therefore thewinner) is chosen at random
The chances of you winning depend directly on the number of tickets that you have
P[winning] = t/T, where t = your number of tickets and T = total number of tickets
Lottery Scheduling
After n rounds, your expected number of wins is
E[wins] = n × P[winning]
The expected number of lotteries that a client must wait before its first win is
E[wait] = 1 / P[winning]
Lottery scheduling implements proportional-share resource management
Ticket currencies allow isolation between users, processes, and threads
OK, so how do we actually schedule the processor using lottery scheduling?
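One plausible answer, as a sketch: hold a lottery per quantum by drawing a uniform ticket number and walking the ticket counts. The `hold_lottery` helper and the ticket allocations below are hypothetical.

```python
import random

# One lottery round: draw a ticket uniformly at random; a job holding
# t of T tickets wins with probability t/T.
tickets = {"A": 75, "B": 25}   # A should win about 75% of rounds

def hold_lottery(tickets):
    total = sum(tickets.values())
    winner = random.randrange(total)      # winning ticket number in [0, total)
    for job, count in tickets.items():
        if winner < count:                # ticket falls in this job's range
            return job
        winner -= count

random.seed(1)
wins = sum(hold_lottery(tickets) == "A" for _ in range(10_000))
print(wins / 10_000)   # close to 0.75
```

Over many quanta the observed CPU share converges to the ticket ratio, which is the proportional-share property the slide describes.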
Implementation
Performance
Allocated and observedexecution ratios between
two tasks running theDhrystone benchmark.With exception of 10:1
allocation ratio, all observedratios are close to allocations
Short-term Allocation Ratio
Isolation
Five tasks running the Dhrystone benchmark. Let amount.currency denote a ticket allocation of amount denominated in currency. Tasks A1 and A2 have allocations 100.A and 200.A, respectively. Tasks B1 and B2 have allocations 100.B and 200.B, respectively. Halfway through the experiment, B3 is started with allocation 300.B. This inflates the number of tickets in B from 300 to 600. There's no effect on tasks in currency A or on the aggregate iteration ratio of A tasks to B tasks. Tasks B1 and B2 slow to half their original rates, corresponding to the factor of 2 inflation caused by B3.
Borrowed-Virtual-Time (BVT) Scheduling
Current scheduling in general-purpose systems does not support rapid dispatch of latency-sensitive applications
Examples include continuous-media applications such as teleconferencing, playing movies, voice-over-IP, etc.
What's the problem with the traditional Unix scheduler?
Beauty of BVT is its simplicity
Corollary: not that much to say
Tricky part is figuring out the appropriate parameters; much of the paper is on this (which I'm going to skip)
BVT Scheduling: Basic Idea
Scheduling is done based on virtual time
Each thread has
EVT (effective virtual time)
AVT (actual virtual time)
W (warp factor)
warpBack (whether warp is on or not)
EVT of a thread is computed as

E = A - (warpBack ? W : 0)

Threads accumulate virtual time as they run
Thread with earliest EVT is scheduled next
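The EVT rule E = A - (warpBack ? W : 0) and the earliest-EVT dispatch can be sketched together; the thread names and virtual-time values below are hypothetical.

```python
# E = A - (warpBack ? W : 0): warping subtracts W from a thread's
# actual virtual time, letting latency-sensitive threads jump ahead.
def evt(avt, warp, warp_back):
    return avt - (warp if warp_back else 0)

threads = [
    # (name, AVT, W, warpBack) -- hypothetical values
    ("batch",  100, 0,  False),
    ("player", 110, 50, True),   # latency-sensitive: warps back to EVT 60
]

next_thread = min(threads, key=lambda t: evt(t[1], t[2], t[3]))
print(next_thread[0])  # player
```

Even though the player thread has consumed more actual virtual time (110 vs. 100), its warp of 50 gives it the earliest EVT, so it is dispatched first.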
BVT Scheduling: Details
Can only switch threads every C time units, to prevent thrashing
Threads can accumulate virtual time at different rates
Allows weighted fair sharing of the CPU
To make sure that latency-sensitive threads are scheduled right away, give these threads high warp values
Have limits on how much and how long a thread can warp to prevent abuse
BVT Scheduling: Performance
BVT vs. Lottery
How do the two compare?
Parallel Processor Scheduling
Simulating Ocean Currents
Model as two-dimensional grids
Discretize in space and time
finer spatial and temporal resolution => greater accuracy
Many different computations per time step
set up and solve equations
Concurrency across and within grid computations
[Figure: (a) Cross sections; (b) Spatial discretization of a cross section]
Case Study 2: Simulating Galaxy Evolution
Simulate interactions of many stars evolving over time
Computing forces is expensive
O(n^2) brute-force approach
Hierarchical methods take advantage of the force law: F = G m1 m2 / r^2
[Figure: a star on which forces are being computed; a star too close must be treated individually, while small and large groups far enough away are approximated by their centers of mass]
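The O(n^2) brute-force approach might look like this sketch: sum the pairwise force F = G*m1*m2/r^2 over every pair of stars (2-D positions, G = 1 in toy units; the `brute_force` helper and its inputs are hypothetical).

```python
import math

# O(n^2) pairwise gravitational forces; stars are (mass, x, y) tuples.
def brute_force(stars):
    n = len(stars)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            mi, xi, yi = stars[i]
            mj, xj, yj = stars[j]
            dx, dy = xj - xi, yj - yi
            r = math.hypot(dx, dy)
            f = mi * mj / r**2          # force magnitude, G = 1
            forces[i][0] += f * dx / r  # project onto x and y components
            forces[i][1] += f * dy / r
    return forces

# Two unit masses 2 units apart attract with force 1/4 each.
print(brute_force([(1.0, 0.0, 0.0), (1.0, 2.0, 0.0)]))  # [[0.25, 0.0], [-0.25, 0.0]]
```

The doubly nested loop is what hierarchical methods like Barnes-Hut avoid by collapsing distant groups into a single center of mass.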
Case Study 2: Barnes-Hut
Many time steps, plenty of concurrency across stars within each time step
Locality Goal:
Particles close together in space should be on same processor
Difficulties: nonuniform, dynamically changing
[Figure: spatial domain and corresponding quad-tree]
Case Study 3: Rendering Scenes by Ray Tracing
Shoot rays into scene through pixels in projection plane
Result is color for pixel
Rays shot through pixels in projection plane are called primary rays
Reflect and refract when they hit objects
Recursive process generates ray tree per primary ray
Tradeoffs between execution time and image quality
[Figure: viewpoint, projection plane, and 3D scene, with a primary ray through the upper-right corner pixel and dynamically generated rays]
Partitioning
Need dynamic assignment
Use contiguous blocks to exploit spatial coherence among neighboring rays, plus tiles for task stealing
A block: the unit of assignment
A tile: the unit of decomposition and stealing
Sample Speedups
Coscheduling (Gang)
Cooperating processes may interact frequently
What problem does this lead to?
Fine-grained parallel applications have a process working set
Two things needed:
Identify the process working set
Coschedule them
Assumption: explicitly identified process working set
Some good recent work has shown that it may be possible to dynamically identify the process working set
Coscheduling
What is coscheduling?
Coordinating across nodes to make sure that processes belonging to the same process working set are scheduled simultaneously
How might we do this?
Impact of OS Scheduling Policies and Synchronization on Performance
Consider performance for a set of applications for
Feedback priority scheduling
Spinning
Blocking
Spin-and-block
Block-and-hand-off
Block-and-affinity
Gang scheduling (time-sharing coscheduling)
Process control (space-sharing coscheduling)
Applications
Normal Scheduling with Spinning
Normal Scheduling with Blocking Locks
Gang Scheduling
Process Control (Space-Sharing)
Multiprocessor Scheduling
Load sharing: poor locality; poor synchronization behavior; simple; good processor utilization. Affinity or per-processor queues can improve locality.
Gang scheduling: central control; fragmentation -- unnecessary processor idle times (e.g., two applications with P/2+1 threads); good synchronization behavior; if careful, good locality
Hardware partitions: poor utilization for I/O-intensive applications; fragmentation -- unnecessary processor idle times when partitions left are small; good locality and synchronization behavior