Date post: | 03-Jan-2016 |
Upload: | anis-shields |
4.1
Introduction to Threads
Overview
Multithreading Models
Thread Libraries
Threading Issues
Operating System Examples
Windows XP Threads
Linux Threads
4.2
Threads
A Thread is just a sequence of instructions to execute
Threads share the same memory space as other threads in the same application – so they automatically share data and variables.
Threads can run on different processor cores on a multicore processor – this makes applications faster and more responsive
Even on a single-core processor, threads make an application more responsive – if one thread blocks waiting for I/O, other threads can still run
Each process has its own virtual memory address space, and the OS takes a lot longer to switch between processes than between threads. Sharing data between processes requires extra steps, so processes have a lot more overhead than threads in many applications. Most applications have one process with several threads.
In C/C++, a thread typically runs the code in a C/C++ function and a special API call starts up a new thread running that function.
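A minimal sketch of that pattern, assuming the standard C++11 std::thread API rather than any particular OS call. The names sum_to and run_in_thread are hypothetical, chosen only for this illustration.

```cpp
#include <thread>

// Hypothetical worker function: adds up the integers 1..n and stores the total.
void sum_to(int n, long* result) {
    long total = 0;
    for (int i = 1; i <= n; ++i) total += i;
    *result = total;
}

// Starts a new thread running sum_to, waits for it to finish,
// and returns the value the thread computed.
long run_in_thread(int n) {
    long result = 0;
    std::thread worker(sum_to, n, &result);  // API call that starts the thread
    worker.join();                           // block until sum_to returns
    return result;
}
```

Calling run_in_thread(100) runs the sum on a second thread while the caller waits in join(). Because both threads share one address space, the worker can write straight into the caller's result variable.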
4.4
Benefits of Threads
Responsiveness
Applications can run up to N times faster on an N core processor
Resource Sharing
Economy
Scalability
4.5
Multicore Programming
Applications only run on one processor core - unless they use multiple threads
Multicore systems are putting more pressure on programmers to use threads. Challenges in multithreaded applications include:
Dividing activities
Balancing the Computational Load
Data splitting
Data dependency
Testing and debugging
4.7
Parallel Execution on a Multicore System
OS can time slice the four Threads T1…T4 on two processor cores. Two threads can run in parallel on different cores. Application could run up to twice as fast. Without threads, an application can run on only one core!
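As an illustration of data splitting, the sketch below divides one array between two threads so that each half can run on its own core. It assumes standard C++11 threads; partial_sum and parallel_sum are hypothetical names, and any real speedup depends on the hardware and the size of the data.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Sums data[begin, end) into *out. Each thread gets its own output slot,
// so the two threads never write to the same variable.
void partial_sum(const std::vector<int>& data, std::size_t begin,
                 std::size_t end, long long* out) {
    *out = std::accumulate(data.begin() + begin, data.begin() + end, 0LL);
}

// Splits the vector in half and sums each half on a separate thread.
long long parallel_sum(const std::vector<int>& data) {
    const std::size_t half = data.size() / 2;
    long long low = 0, high = 0;
    std::thread t1(partial_sum, std::cref(data), std::size_t{0}, half, &low);
    std::thread t2(partial_sum, std::cref(data), half, data.size(), &high);
    t1.join();
    t2.join();
    return low + high;  // combine the two partial results
}
```

The two halves have no data dependency on each other, which is what makes this split safe without any locking.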
4.8
User Threads
Thread management done by a user-level threads library
Three primary thread libraries:
POSIX Pthreads
Win32 threads
Java and C# threads
4.9
Thread Libraries
Thread library provides programmer with API for creating and managing threads
Two primary ways of implementing
Library entirely in user space
Kernel-level library supported by the OS
4.10
Pthreads
A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
The API specifies the behavior of the thread library; the implementation is up to the developers of the library
Common in UNIX operating systems (Solaris, Linux, Mac OS X)
Can also be added to Windows by installing the optional Pthreads library
4.11
Java and C# Threads
Thread support is built into these newer languages with keywords
Java threads are managed by the JVM
C# thread support is in the .NET Framework (whose runtime, the CLR, is the C# counterpart of the JVM)
Typically implemented using the threads model provided by underlying OS
Java threads may be created by:
Extending Thread class
Implementing the Runnable interface
C# threads are created by passing a delegate (a method to run) to the Thread constructor
4.12
Threading Issues
Semantics of fork() and exec() system calls
Thread cancellation of target thread
Asynchronous or deferred
Signal handling
Thread pools
Thread-specific data
Scheduler activations
4.13
Thread Cancellation
Terminating a thread before it has finished
Two general approaches:
Asynchronous cancellation terminates the target thread immediately
Deferred cancellation allows the target thread to periodically check if it should be cancelled
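Deferred cancellation can be sketched with a shared flag that the target thread polls. This assumes standard C++11 atomics and threads, not the Pthreads cancellation API; run_until_cancelled and its parameter are hypothetical names.

```cpp
#include <atomic>
#include <thread>

// Starts a worker, lets it complete at least min_iterations loop passes,
// then requests cancellation. Returns the number of passes completed.
long run_until_cancelled(long min_iterations) {
    std::atomic<bool> cancel_requested{false};
    std::atomic<long> iterations{0};

    std::thread worker([&] {
        // Cancellation point: the flag is checked once per pass, so the
        // thread stops at a place of its own choosing, never mid-update.
        while (!cancel_requested.load()) {
            ++iterations;  // stands in for one unit of real work
        }
    });

    while (iterations.load() < min_iterations) std::this_thread::yield();
    cancel_requested.store(true);  // request cancellation; do not kill the thread
    worker.join();                 // the worker sees the flag and returns
    return iterations.load();
}
```

The requesting thread only sets a flag. The target thread decides when it is safe to exit, which is why deferred cancellation avoids leaving shared data half-updated.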
4.14
Signal Handling
Signals are used in UNIX systems to notify a process that a particular event has occurred
A signal handler is used to process signals
1. Signal is generated by particular event
2. Signal is delivered to a process
3. Signal is handled
Options:
Deliver the signal to the thread to which the signal applies
Deliver the signal to every thread in the process
Deliver the signal to certain threads in the process
Assign a specific thread to receive all signals for the process
4.15
Thread Pools
Create a number of threads in a pool where they await work
Advantages:
Usually slightly faster to service a request with an existing thread than create a new thread
Allows the number of threads in the application(s) to be bound to the size of the pool
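A minimal thread pool sketch, assuming standard C++11 threads, mutexes, and condition variables. ThreadPool and count_with_pool are hypothetical names written for this illustration, not part of any library mentioned in these slides.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Fixed-size pool: n threads are created once and then wait for work.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { work_loop(); });
    }

    // Lets the workers drain the queue, then stops and joins them.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }

    // Queues a task for whichever pooled thread becomes free first.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void work_loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Runs `jobs` increments on a pool of `threads` workers and returns the total.
int count_with_pool(std::size_t threads, int jobs) {
    std::atomic<int> done{0};
    {
        ThreadPool pool(threads);
        for (int i = 0; i < jobs; ++i) pool.submit([&done] { ++done; });
    }  // the pool's destructor waits for all queued tasks to finish
    return done.load();
}
```

However many jobs are submitted, at most `threads` of them run at once, which is the bound on thread count described above.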
4.16
Windows Threads
Implements the one-to-one mapping, kernel-level
Each thread contains
A thread id
Register set
Separate user and kernel stacks
Private data storage area
The register set, stacks, and private storage area are known as the context of the thread
4.17
Linux Threads
Linux refers to them as tasks rather than threads
Thread creation is done through clone() system call
clone() allows a child task to share the address space of the parent task (process)
Background on the need for Synchronization
• Threads may need to wait for other threads to finish an operation
• Additionally concurrent access to shared data with threads may result in data inconsistency (i.e., incorrect values)
• Maintaining data consistency requires mechanisms to ensure the orderly execution of cooperating processes (or threads)
Example Problem
• Suppose two threads share a common buffer array. The producer puts items in the buffer and the consumer removes them.
• A two-thread producer-consumer solution that can fill all of the buffer space keeps an integer count of the number of full buffers. Initially, count is set to 0. The producer increments it after producing a new buffer, and the consumer decrements it after consuming a buffer.
Producer
while (true) {
    /* produce an item and put in nextProduced */
    while (count == BUFFER_SIZE)
        ; // do nothing
    buffer[in] = nextProduced;
    in = (in + 1) % BUFFER_SIZE;
    count++;
}
Consumer
while (true) {
    while (count == 0)
        ; // do nothing
    nextConsumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    count--;
    /* consume the item in nextConsumed */
}
Critical Section
• A code segment that reads or writes global data shared between threads or processes is called a “critical section”
• Possible race condition bugs on global variable values – example will follow
• OS Synchronization API used to solve this
• Must be careful and use OS synchronization primitives to control access to a critical section or hidden bugs will appear in code
Race Condition on Count
• count++ could be implemented as
    register1 = count
    register1 = register1 + 1
    count = register1
• count-- could be implemented as
    register2 = count
    register2 = register2 - 1
    count = register2
• Consider this execution interleaving with “count = 5” initially:
    S0: producer executes register1 = count {register1 = 5}
    S1: producer executes register1 = register1 + 1 {register1 = 6}
    S2: consumer executes register2 = count {register2 = 5}
    S3: consumer executes register2 = register2 - 1 {register2 = 4}
    S4: producer executes count = register1 {count = 6}
    S5: consumer executes count = register2 {count = 4}
Need an Atomic Operation
• The count++ and count-- code must run to completion before switching to the other thread to avoid bugs
• Atomic operation here means a basic operation which cannot be stopped or interrupted in the middle to switch to another thread
• Race conditions will show up more often on systems with multiple processors since threads are running in parallel
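One way to get an atomic count++ and count-- is sketched below, assuming the standard C++11 std::atomic type rather than an OS-specific primitive. The function name balanced_count is hypothetical.

```cpp
#include <atomic>
#include <thread>

// Runs a "producer" that increments and a "consumer" that decrements the
// same counter n times each, then returns the final value.
int balanced_count(int start, int n) {
    std::atomic<int> count{start};
    // On std::atomic<int>, ++ and -- are single indivisible read-modify-write
    // operations, so the load/add/store steps of two threads cannot interleave.
    std::thread producer([&] { for (int i = 0; i < n; ++i) count++; });
    std::thread consumer([&] { for (int i = 0; i < n; ++i) count--; });
    producer.join();
    consumer.join();
    return count.load();
}
```

With equal numbers of increments and decrements the counter always returns to its starting value. With a plain int in place of std::atomic<int>, the same code has a data race and can lose updates.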
Solution to Critical-Section Problem
1. Mutual Exclusion (Mutex) - If process Pi is executing in its critical section, then no other processes can be executing in their critical sections
2. Progress - If no process is executing in its critical section and there exist some processes that wish to enter their critical section, then the selection of the processes that will enter the critical section next cannot be postponed indefinitely
3. Bounded Waiting - A bound must exist on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted
Assume that each process executes at a nonzero speed
No assumption concerning relative speed of the N processes
Solution to Critical-section Problem Using Mutex Locks
do {
    acquire lock
        critical section
    release lock
        remainder section
} while (TRUE);
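The acquire/release pattern above can be sketched with a standard C++11 std::mutex. This is an illustration under that assumption, not the OS API the slides have in mind, and locked_count is a hypothetical name.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Several threads increment one plain int; the mutex makes each
// increment a critical section. Returns the final count.
int locked_count(int threads, int increments_each) {
    int count = 0;           // shared data
    std::mutex count_mutex;  // guards count

    auto worker = [&] {
        for (int i = 0; i < increments_each; ++i) {
            std::lock_guard<std::mutex> lock(count_mutex);  // acquire lock
            ++count;                                        // critical section
        }  // lock released when `lock` goes out of scope each pass
    };

    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return count;
}
```

std::lock_guard releases the mutex automatically at the end of the block, so the release step cannot be forgotten on any exit path.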
Deadlock and Starvation
• Deadlock – two or more processes or threads are waiting indefinitely for an event that can be caused by only one of the waiting processes
• Let S and Q be two semaphores initialized to 1 (i.e. a mutual exclusion lock)
        P0              P1
    wait (S);       wait (Q);
    wait (Q);       wait (S);
      .               .
      .               .
      .               .
    signal (S);     signal (Q);
    signal (Q);     signal (S);
• Starvation – indefinite blocking. A process may never be removed from the semaphore queue in which it is suspended
• Priority Inversion – scheduling problem when a lower-priority process holds a lock needed by a higher-priority process. The lower-priority process may need to run first so that the higher-priority one can continue, which upsets the intended priority order
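The P0/P1 deadlock comes from taking S and Q in opposite orders. One common remedy, sketched here with standard C++11 mutexes standing in for the two semaphores, is to acquire both locks in a single std::lock call, which is specified to avoid deadlock whatever order the arguments are given in. The function name run_p0_p1 is hypothetical.

```cpp
#include <mutex>
#include <thread>

// P0 and P1 each need both S and Q and name them in opposite orders.
// Returns the number of critical sections completed by both threads.
int run_p0_p1(int rounds) {
    std::mutex S, Q;
    int shared = 0;  // protected by holding both S and Q

    auto p0 = [&] {
        for (int i = 0; i < rounds; ++i) {
            std::lock(S, Q);  // take both locks without risk of deadlock
            std::lock_guard<std::mutex> gs(S, std::adopt_lock);
            std::lock_guard<std::mutex> gq(Q, std::adopt_lock);
            ++shared;
        }  // both locks released here
    };
    auto p1 = [&] {
        for (int i = 0; i < rounds; ++i) {
            std::lock(Q, S);  // opposite order, as P1 does in the slide
            std::lock_guard<std::mutex> gq(Q, std::adopt_lock);
            std::lock_guard<std::mutex> gs(S, std::adopt_lock);
            ++shared;
        }
    };

    std::thread t0(p0), t1(p1);
    t0.join();
    t1.join();
    return shared;
}
```

If each thread instead locked its first mutex and then its second in separate calls, the two threads could each hold one lock and wait forever for the other. Always taking locks in one agreed order is the other common way to prevent this.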
RTOS
• Real Time Operating System (RTOS)
• Used in systems that need a fast response time to external events on the order of milliseconds
• This is about 10-100X faster than PCs
• The general purpose OS in a PC is optimized for throughput and a fast graphical user interface – but at the expense of the Real Time response
Mbed RTOS & Threads
• Runs a 1 ms time slice to switch between threads; this is about 10-100X faster than PCs
• Limited memory allows only around 8 threads – each thread needs its own stack and the RTOS also uses a fair chunk of the RAM (32K). RAM is used for variables only. Nonvolatile Flash memory stores code and constants; there is 512K of it, so it is typically not the issue.
MBED RTOS
• The mbed RTOS also provides some basic synchronization primitives:
– Mutex Lock – used to lock and unlock access to shared memory (variables) and I/O devices
– On the mbed compiler, the keyword volatile forces every access to a simple built-in global variable (but not arrays) to go to memory, so one thread sees another thread's update. It does not make multi-step updates such as count++ atomic, so those still need a mutex
– Signals – can be used to send signals between threads
MBED RTOS
• Semaphores – a more advanced synchronization primitive than a mutex. Can count things, but also slower than a mutex.
• Thread::wait(x ms) – tells the RTOS scheduler to not run this thread again until x ms of time has passed. Useful to keep a thread from using too much processor time when it does not need it. Other threads run during the delay.
• Don’t use wait – use Thread::wait
Mbed RTOS
• Free for ARM mbed users. Many RTOSes require a license fee. Just need the RTOS library in the project and a new #include “rtos.h” after the mbed.h include
• Documentation and code examples are found in the mbed Handbook under “Real Time Operating System”; click the “mbed RTOS” link
• Free networking libraries that use the RTOS are also available for Internet of Things (IoT) devices