Chapter 6: Scheduling
Start of Lecture: February 12, 2014
Reminders
• I’ll be away on Friday; Ankush Roy is going to go through select_server.c, which is important for server_s.c in Assignment 2

• Office hours moved to 1000-1200 today
• Advice: the assignment might feel overwhelming, so
• to do your forking server, look at server.c in example code; start by adding a section to the forked child that reads anything from a client
• the threading server is similar to the forking server
• hint: select_server.c has much of the functionality you need, and useful separation/definitions, like struct con; it just doesn’t parse HTTP requests
• Any questions or comments?
Quick refresh of server.c
• Recall server.c just sends a string to the client; we could quickly modify it to also do a read from the client
• You will have to add utilities for correctly reading GET strings and parsing out the important information; server.c has some of the functionality you need, and select_server.c much of it
Scheduling so far
• First-Come, First-Served and Round Robin
• Shortest Job First and some ways to predict CPU bursts
• and the joys of simple exponential averaging and running average algorithms
• Priority scheduling with priority queues (min or max heaps)
• Issues with different CPU scheduling algorithms and having to pick just one on a mixed-job system
• leading to the idea of multi-level queues
Linux Example: Completely Fair Scheduler
• Scheduling classes: uses multi-level feedback queues to separately schedule different types of processes
• CFS tries to optimize for a targeted latency requirement, where the targeted latency is an interval of time during which every runnable task should run at least once
• targeted latency has default and minimum values, but is also adaptive
• Uses a priority scheduling approach, where low priority numbers correspond to high priority
• uses red-black tree to store and choose processes
• The priority of a process depends on an external priority and an internal priority
• Externally set priorities (user level priority): “nice” values
• Measure for internally setting priority: physical runtime
• Priority value called virtual runtime can correspond to the actual physical runtime, but is generally higher or lower depending on “nice” value
• task with normal priority has virtual runtime = physical runtime
• task with high priority has virtual runtime < physical runtime
• task with low priority has virtual runtime > physical runtime
A little bit more about Linux priorities
• “Nice” values range from -20 to +19
• seems weird and arbitrary, but can you think of a reason this might be the case, since we use these values to compute vruntime?
• Because vruntime = (physical runtime) * weight(nice) where weight(nice) = 1.25^(nice)
• For nice = -1 (higher priority), weight = 0.8
• For nice = 1 (lower priority), weight = 1.25
• For nice = 0 (normal priority), weight = 1 (i.e. vruntime = physical runtime)
Red-Black Trees
• Balanced binary search trees enable efficient O(log n) insertion, deletion, and search over ordered data; but finding algorithms that maintain perfectly balanced BSTs has proved challenging
• Red-Black Trees: approximately balanced binary search trees, balanced enough to guarantee O(log n) insertion, deletion, and search
• The kernel keeps an ordered set of processes using a red-black tree, with the highest-priority process at the front
Linux Scheduling Red-Black Tree
Why Red-Black Trees and not Priority Queues?
• Priority queues (heaps) are efficient for removing the highest-priority process, but are not efficient for searching for other processes
• Red-Black trees keep an ordered set, so we can quickly find any element among the processes (unlike a heap)
• more flexible to have an ordered structure for updating
• Red-Black trees are not as efficient for removing the highest-priority process, but are much better for accessing processes other than just the highest-priority one
• in practice, Linux caches the leftmost (i.e. highest priority) location, so the look-up ends up being O(1) rather than O(log n)
CFS struct hierarchy
struct sched_entity
• Scheduling entities were introduced to implement group scheduling; this way, CFS provides fair CPU time for individual tasks AND groups of tasks
• scheduling groups are a version of multi-level queues, where each group has its own queue (creating a hierarchy of scheduling)
• different scheduling groups might get more CPU time, but within a group, CFS ensures that the processes get fair CPU time
• Let’s look at struct sched_entity for fun!
• grep -nR "struct sched_entity {" linux_folder
Video Break: brought to you by another amazing classmate
Midterm Feedback
• I want you to learn as much as possible (the course is for you to learn, despite the seemingly inconsistent message that grading is all that counts), so feedback means I can better tailor and improve the course
Multi-Processor Scheduling
• What do you think are some issues for scheduling on multiple processors?
• Should they share a ready queue? Have their own queues?
• Do multi-processor systems versus multicore processors have to be treated differently?
• Load balancing between processors — processes migrate between processors to keep workload evenly distributed
• Should you use pull migration (an idle CPU pulls a task from a busy one) or push migration (a periodic task checks each CPU’s load and pushes processes from overloaded ones)?
• Processor affinity — process wants to stay on same processor because recent variables all stored in cache; has to repopulate on new processor
Multiprocessor Scheduling is an NP-complete problem
• Assume you are given
1. a set of n jobs, where job j_i has length l_i, and
2. m identical processors
• Decision problem: does there exist a schedule of the n jobs on the m processors that completes before deadline D?
• Optimality problem: what is the minimum possible time required to schedule all n jobs on m processors?
• The load-balancing decision problem is known to be NP-complete; the load-balancing optimality problem is NP-hard
What is NP-complete again?
• Recall: NP means Non-deterministic polynomial time
• can be solved in polynomial time on nondeterministic Turing machine
• A problem is considered to be in NP if a solution can be verified in polynomial time
• e.g. we can check, in polynomial time, whether a given schedule of n jobs completes before deadline D (i.e. verify that it is a correct solution)
• NP-complete: if you could solve this problem in polynomial time then could solve all problems in NP in polynomial time
• A problem H is NP-hard if and only if there is an NP-complete problem L reducible to H (i.e. a solution for H could be used to solve L); H need not be in NP
Who cares about NP-hardness?
• NP-hardness is not just a theoretical construct; it is good to know the hardness of the problem you are facing so that
• you don’t waste time trying to find an optimal algorithm
• you can look into appropriate approximation algorithms
• Approximation algorithms are polynomial-time algorithms that give a suboptimal solution to an NP-complete problem
• Approximation algorithms have a factor specifying how suboptimal their solution is to the original problem
• e.g. a version of the travelling salesman problem that is NP-complete has an approximation algorithm with factor 8/7 (so pretty close)
Approximation algorithms for Multiprocessor Scheduling
• Assume we have n jobs with their runtimes and m processors; find a non-preemptive schedule with minimum finish time
• Approximation ratio: (approximation algorithm’s finish time) / (optimal schedule’s finish time)
• Simplest approximation: greedy solution. Iterate through the list of processes, assigning each to the processor with the currently lightest load
• This is a 2-approximation (finish time at most 2x optimal)
• Better approximation: longest processing time (LPT). Sort jobs in descending order of processing time, then run the greedy solution
• This is a 4/3-approximation (at most about 1.33x optimal); getting to be reasonable
• What are the runtimes of the above algorithms?
• Recall: we have n jobs with their runtimes and m processors; find a non-preemptive schedule with minimum finish time
• Greedy solution: 2-approximation
• LPT solution: 4/3-approximation
• Another algorithm*: approximation ratio < nm / (nm - m + 1)
• e.g. n = 10, m = 3: ratio = 1.07
• e.g. n = 100, m = 3: ratio ~= 1.00 (almost perfect)
• Could be useful for batch systems, particularly if you have a reasonable way to estimate job length
*A linear time approximation algorithm for multiprocessor scheduling. G Finn, E Horowitz, 1976