BFS by Con Kolivas
Guruprasad Aphale. Real Time Lunch, 10/21/2009
Outline
• What is BFS?
• CFS vs. BFS
• Design of BFS
  ▫ Design Details
  ▫ Design Decisions
• Performance Evaluation
  ▫ Comparison with CFS
• Questions?
Why BFS?? [1]
What is BFS?
• A “fair” scheduler by Con Kolivas
• Simple, basic design
• Forward looking only
• Time slice independent of sleep time
  ▫ Fixed time slice length
• Desktop-oriented scheduler
  ▫ Meant for low-spec machines
  ▫ Not meant to scale to large hardware
  ▫ NUMA unaware
Comparison of CFS and BFS

Feature                  | CFS             | BFS
Scheduling Complexity    | O(log n)        | O(n)
Ideal For                | Big machines    | Desktop to mid-size (fewer than 16 logical CPUs)
Tunable Knobs            | 8 [2]           | 1
Run Queues               | Multiple        | Single
Run Queue Implementation | Red-black trees | Doubly linked lists
Design Complexity        | Complex         | Simple, basic
Code Complexity          | Complex         | Less than CFS (line count also reduced by more than 4000)
Design Details
• Virtual Deadline
• Task Insertion
• Task Selection
• Locking
Virtual Deadline
• “Round robin interval” (rr_interval)
  ▫ Length of the time slice in ms
  ▫ Configurable
  ▫ Scaled by the number of CPUs: rr_interval *= 1 + lg(num_online_cpus)
• When a task is created, it is given a time slice and a virtual deadline
• Virtual Deadline = jiffies + (user_priority * rr_interval)
• user_priority: nice value + 20 (in range 0..39)
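
The two formulas above can be tried out with a small sketch. It only mirrors the arithmetic on this slide and is not the kernel code. Assumptions: lg is taken as the base-2 logarithm, one jiffy is treated as 1 ms so the units line up, and the function names and the 6 ms base interval in the usage note are illustrative.

```python
import math


def scaled_rr_interval(base_ms: int, num_online_cpus: int) -> int:
    """Apply the slide's scaling: rr_interval *= 1 + lg(num_online_cpus)."""
    return int(base_ms * (1 + math.log2(num_online_cpus)))


def user_priority(nice: int) -> int:
    """Map a nice value (-20..19) to user_priority (0..39)."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be in the range -20..19")
    return nice + 20


def virtual_deadline(jiffies: int, nice: int, rr_interval_ms: int) -> int:
    """Slide formula: jiffies + user_priority * rr_interval.

    Assumes 1 jiffy == 1 ms (illustrative only).
    """
    return jiffies + user_priority(nice) * rr_interval_ms
```

With an assumed base interval of 6 ms on 4 CPUs, scaled_rr_interval(6, 4) gives 18, and a nice-0 task created at jiffies = 1000 gets a virtual deadline of 1360. A nice -20 task gets a deadline equal to the current jiffies value, so it sorts ahead of everything created at the same time.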
Virtual Deadline Continued
• The deadline is a virtual one
  ▫ No guarantee that a task will be scheduled by this time
  ▫ Used to compare which task should go next
• Events causing rescheduling
  ▫ time_slice exhausted
  ▫ Sleep
  ▫ Preemption
Virtual Deadline Continued
• After a task is swapped out
  ▫ It is put back on the queue
  ▫ The task with the earliest deadline is chosen from the queued but not running tasks
• A problem with this approach
  ▫ A task’s deadline may already have passed before it is chosen for execution
  ▫ This indicates that the task needs CPU time ahead of all tasks with later deadlines
Task Insertion
• There is only one global queue, which contains sub-queues
• A task is inserted into the relevant sub-queue
  ▫ Each insertion takes O(1) time, since it is an insertion into a doubly linked list
• Check whether the new task can run on an idle CPU or preempt a running lower-priority task
  ▫ Don’t wait for schedule to be called if a CPU is idle or a lower-priority task is running
• Worst-case time is O(M), where M is the number of CPUs
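
The insertion path above can be sketched roughly as follows. This is a simplified model rather than BFS source: the Task class, the convention that a lower prio number means higher priority, and both function names are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Task:
    name: str
    prio: int  # assumption: lower number means higher priority


def enqueue(subqueues: Dict[int, deque], task: Task) -> None:
    """O(1) insertion: append to the tail of the task's sub-queue."""
    subqueues.setdefault(task.prio, deque()).append(task)


def find_cpu_for(running: List[Optional[Task]], new: Task) -> Optional[int]:
    """One O(M) pass over M CPUs; running[i] is None when CPU i is idle.

    Returns an idle CPU if there is one, otherwise the CPU running the
    lowest-priority task that `new` outranks, otherwise None.
    """
    victim = None
    for cpu, current in enumerate(running):
        if current is None:
            return cpu  # idle CPU: run there immediately
        outranked = current.prio > new.prio
        if outranked and (victim is None or current.prio > running[victim].prio):
            victim = cpu
    return victim
```

The scan touches each CPU once, which is where the O(M) worst case comes from. When find_cpu_for returns None, the task simply stays queued until a later scheduling decision picks it up.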
[Figure: Insertion. A bitmap sits above the array of sub-queue heads (RT tasks 1-100, SCHED_ISO, SCHED_NORMAL, SCHED_IDLEPRIO), with queued tasks t1, ti, tj, tn. New task tn+1 is inserted as SCHED_ISO; new task tn+2 is inserted as SCHED_RR (3rd sub-queue).]
Task Selection
• Check the bitmap to find the first set bit
• If the first set bit corresponds to an RT sub-queue, select the first task in that queue
• If no RT bit is set, search for the next task based on EVDF (earliest virtual deadline first) and suitable CPU affinity
  ▫ If a task with an expired deadline is found, return that task
• In most cases, a lookup of the entire queue is not required
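
A minimal sketch of the selection steps above, under stated assumptions: sub-queue indices 0..99 are the RT queues, the bitmap is a plain integer with bit i set when sub-queue i is non-empty, and Task, first_set_bit, and pick_next are illustrative names. Applying the affinity check to RT tasks as well is a simplification of this sketch, not something the slide states.

```python
from collections import deque
from dataclasses import dataclass

NUM_RT = 100  # assumption: sub-queue indices 0..99 are the RT queues


@dataclass
class Task:
    name: str
    deadline: int = 0
    allowed_cpus: frozenset = frozenset({0})


def first_set_bit(bitmap: int) -> int:
    """Index of the lowest set bit, or -1 if no bit is set."""
    return (bitmap & -bitmap).bit_length() - 1


def pick_next(bitmap: int, subqueues: dict, cpu: int, now: int):
    """Return the task `cpu` should run next, or None (nothing is dequeued)."""
    remaining = bitmap
    while remaining:
        idx = first_set_bit(remaining)
        remaining &= remaining - 1  # clear the bit just examined
        queue = subqueues[idx]
        if idx < NUM_RT:
            # RT sub-queue: take the first task that may run on this CPU.
            for task in queue:
                if cpu in task.allowed_cpus:
                    return task
            continue
        best = None
        for task in queue:
            if cpu not in task.allowed_cpus:
                continue
            if task.deadline <= now:
                return task  # expired deadline: stop scanning early
            if best is None or task.deadline < best.deadline:
                best = task
        if best is not None:
            return best
    return None
```

The early return on an expired deadline is what lets the scan stop before reaching the end of the queue in most cases. The caller would then remove the returned task from its sub-queue and assign it to the CPU.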
[Figure: Selection. The bitmap and array of sub-queue heads (RT tasks 1-100, SCHED_ISO, SCHED_NORMAL, SCHED_IDLEPRIO) with queued tasks labelled d=7 and d=5. The selected task is removed from the queue and assigned to a CPU.]
Locking
• There is only one queue, so only one lock protects the process data
• Every operation that modifies the queue has to acquire the global lock
• Once a task is assigned to a CPU, the CPU removes it from the queue and creates a local copy of the task’s data structure, so subsequent updates are lockless
• Max number of tasks in global queue = number of tasks remaining - number of CPUs + 1
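
The locking scheme can be illustrated with a user-space sketch. It uses threading.Lock as a stand-in for the global run queue lock; GlobalRunQueue, take_for_cpu, and max_queued are hypothetical names, and reading "tasks remaining" as the number of runnable tasks is an assumption, as is clamping the bound at zero.

```python
import copy
import threading
from collections import deque


class GlobalRunQueue:
    """One queue, one lock: every queue modification takes the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tasks = deque()

    def enqueue(self, task: dict) -> None:
        with self._lock:
            self._tasks.append(task)

    def take_for_cpu(self):
        """Remove a task under the lock and hand back a private copy."""
        with self._lock:
            if not self._tasks:
                return None
            task = self._tasks.popleft()
        # The copy belongs to one CPU, so later updates need no lock.
        return copy.copy(task)


def max_queued(runnable_tasks: int, num_cpus: int) -> int:
    """Slide's bound: tasks - CPUs + 1 (clamped at zero in this sketch)."""
    return max(runnable_tasks - num_cpus + 1, 0)
```

For example, with 10 runnable tasks on 4 CPUs the bound is 10 - 4 + 1 = 7 queued tasks, which is the most a single O(n) lookup would have to walk.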
Design Decisions
• Single run queue (O(n) lookup)
  ▫ Refers to queued but not running processes
  ▫ Absence of multiple queues means no complex interactions among them
• Scheduling decisions based only on CPU usage
  ▫ Sleep time is not taken into consideration in scheduling decisions
  ▫ No interactivity estimator present
Design Decisions
• Interactivity in BFS
  ▫ Tasks that are waking up try to preempt tasks of the same priority
  ▫ Even if they do not preempt, the waiting time for a task is still bounded (because of rr_interval), so the task is scheduled within a bounded time frame
Performance Evaluation [3]

Test                                                                     | CFS                         | BFS
Timed Apache Compilation                                                 | 131.48 seconds              | 122.76 seconds
Timed PHP Compilation                                                    | 197.90 seconds              | 197.21 seconds
7-Zip Compression                                                        | 2180.66 MIPS                | 2104.00 MIPS
Apache Benchmark                                                         | 1902.77 requests per second | 3140.67 requests per second
Threaded I/O Tester (read, file size per thread: 16 MB, thread count: 8) | 14.66 ms                    | 16.00 ms

System configuration: dual-core Intel Atom 330 CPU with Hyper-Threading, clocked at 2.10 GHz

Good for interactive desktops? People have been doing a lot of benchmarking, but the results are not conclusive.
Thank You
Questions???
References
1. http://www.linux-magazine.com/Online/News/Con-Kolivas-Introduces-New-BFS-Scheduler
2. http://www.ibm.com/developerworks/linux/library/l-cfs/
3. http://www.phoronix.com/scan.php?page=article&item=bfs_scheduler_benchmarks&num=1
Global Queue Structure
• Contains 103 sub-queues
  ▫ 100 RT queues
  ▫ SCHED_ISO queue
  ▫ SCHED_NORMAL queue
  ▫ SCHED_IDLEPRIO queue
  ▫ Each sub-queue is a doubly linked list
• When a new task is set up, the matching bit in a bitmap of running priorities is set
• Each bit corresponds to a sub-queue inside the global queue
• The bitmap shows which priorities have waiting tasks
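
The structure described above can be modelled in a few lines. This is an illustrative sketch, not the kernel data structure: the index layout (0..99 for RT, then ISO, NORMAL, IDLEPRIO), the class name, and the use of collections.deque in place of a kernel doubly linked list are all assumptions.

```python
from collections import deque

NUM_RT = 100
ISO_IDX, NORMAL_IDX, IDLEPRIO_IDX = 100, 101, 102  # assumed index layout
NUM_SUBQUEUES = 103


class GlobalQueue:
    """103 sub-queues plus a bitmap with one bit per sub-queue."""

    def __init__(self):
        self.subqueues = [deque() for _ in range(NUM_SUBQUEUES)]
        self.bitmap = 0

    def enqueue(self, idx: int, task) -> None:
        self.subqueues[idx].append(task)
        self.bitmap |= 1 << idx  # mark this priority as having waiters

    def dequeue(self, idx: int, task) -> None:
        # deque.remove is O(n); a real doubly linked list unlinks in O(1).
        self.subqueues[idx].remove(task)
        if not self.subqueues[idx]:
            self.bitmap &= ~(1 << idx)  # last waiter gone: clear the bit

    def has_waiting(self, idx: int) -> bool:
        return bool((self.bitmap >> idx) & 1)
```

Keeping the bitmap in step with the sub-queues means a scheduler can find the highest-priority non-empty sub-queue with a single find-first-set operation instead of probing all 103 lists.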
[Figure: Global Run Queue Structure (Only Queues). A bitmap sits above the array of sub-queue heads (RT tasks 1-100, SCHED_ISO, SCHED_NORMAL, SCHED_IDLEPRIO); the task sub-queues hold tasks t1, ti, tj, tn.]