Chapter 4
Interrupts and Exceptions
Hsung-Pin Chang
Department of Computer Science
National Chung Hsing University
Outline
• The Role of Interrupt Signals
• Interrupts and Exceptions
• Nested Execution of Exception and Interrupt Handlers
• Initializing the Interrupt Descriptor Table
• Exception Handling
• Interrupt Handling
• Softirqs, Tasklets, and Bottom Halves
• Returning from Interrupts and Exceptions
Introduction
• Interrupts: electrical signals generated by hardware circuits both inside and outside the CPU chip
• Two kinds of interrupts
  – Synchronous
    • Produced by the CPU control unit
    • Called synchronous because the control unit issues them only after terminating the execution of an instruction
  – Asynchronous
    • Generated by other hardware devices at arbitrary times
Introduction (Cont.)
• Intel manuals designate synchronous and asynchronous interrupts as exceptions and interrupts, respectively
• Interrupts are issued by timers and I/O devices
• Exceptions are caused by
  – Programming errors, e.g., division by zero
  – Or anomalous conditions that must be handled by the kernel
    • A Page Fault, or a request (via an int instruction) for a kernel service
Introduction (Cont.)
• There is a key difference between interrupt handling and process switching
  – The code executed by an interrupt or an exception handler is not a process
  – It is a kernel control path that runs on behalf of the same process that was running when the interrupt occurred
Introduction (Cont.)
• Interrupt handling is a sensitive task performed by the kernel, because
  – Interrupts can come at any time, when the kernel may want to finish something else
    • Interrupt handling is divided into two parts
      – Top half: executed by the kernel right away
      – Bottom half: left for later
        » The kernel keeps a queue pointing to all the functions that represent bottom halves
Introduction (Cont.)
  – Interrupts can come at any time, so the kernel might be handling one of them while another one occurs
    • Nested interrupts
  – Some critical regions exist inside the kernel code where interrupts must be disabled
    • Such critical regions must be limited as much as possible
Interrupts and Exceptions
• Intel classifies interrupts and exceptions as follows
  – Interrupts
  – Exceptions
Interrupts
• Interrupts
  – Maskable interrupts: all Interrupt Requests (IRQs) issued by I/O devices give rise to maskable interrupts
  – Nonmaskable interrupts
    • Always recognized by the CPU
Exceptions (Cont.)
• Exceptions
  – Processor-detected exceptions
    • Fault
    • Trap
    • Abort
  – Programmed exceptions, also called software interrupts
Exceptions
• Exceptions
  – Processor-detected exceptions: divided into three groups depending on the value of the eip register saved on the Kernel Mode stack when the exception occurs
    • Faults
      – Can generally be corrected; once corrected, the program can resume with no loss of continuity
      – The saved eip is the address of the instruction that caused the fault
Interrupts and Exceptions (Cont.)
    • Traps
      – Execution can also continue after the trap has been handled
      – The saved eip is the address of the instruction that follows the one that caused the trap
      – The main use of traps is for debugging
    • Aborts
      – A serious error occurred, and the affected process is terminated
      – The control unit may be unable to store in eip the precise location of the instruction causing the exception
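The fault/trap/abort distinction above comes down to which eip value is saved. A minimal sketch in C follows; the helper name saved_eip and its parameters are hypothetical and only model the rule stated on the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Exception classes as described on the slides. */
enum exc_class { EXC_CLASS_FAULT, EXC_CLASS_TRAP, EXC_CLASS_ABORT };

/*
 * Hypothetical helper: returns the eip value that would be saved on the
 * Kernel Mode stack, given the address and length of the instruction
 * that raised the exception. For an abort the value may be imprecise,
 * which is reported through *precise.
 */
static uint32_t saved_eip(enum exc_class cls, uint32_t insn_addr,
                          uint32_t insn_len, int *precise)
{
    assert(precise != 0);
    *precise = (cls != EXC_CLASS_ABORT);
    switch (cls) {
    case EXC_CLASS_FAULT:
        return insn_addr;             /* the faulting instruction is re-executed */
    case EXC_CLASS_TRAP:
        return insn_addr + insn_len;  /* execution resumes at the next instruction */
    default:
        return insn_addr;             /* abort: this value cannot be trusted */
    }
}
```

For a one-byte instruction at 0x1000, a fault saves 0x1000 while a trap saves 0x1001.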
Exceptions
  – Programmed exceptions
    • Occur at the request of the programmer, via the int and int3 instructions
    • Handled by the control unit as traps
    • Also called software interrupts
      – Used to implement system calls
      – Used to notify a debugger of a specific event
Interrupts and Exceptions
• Each interrupt or exception is identified by a number from 0 to 255
  – Intel calls this 8-bit number a vector
IRQs and Interrupts
• Interrupt ReQuest (IRQ)
  – The output line that each device uses to issue interrupt requests
  – Connected to the input pins of the Interrupt Controller
Interrupt Controller
• Monitors the IRQ lines for raised signals
• If a raised signal occurs
  – Converts it into a corresponding vector
  – Stores the vector in an I/O port
  – Sends a signal to the processor's INTR pin; that is, issues an interrupt
  – Waits until the CPU acknowledges the interrupt by writing into an I/O port, and then clears the INTR line
IRQs and Interrupts
• Each IRQ line can be selectively disabled
  – Disabled interrupts are not lost
  – The programmable interrupt controller (PIC) sends them to the CPU as soon as they are enabled again
  – Used to process IRQs of the same type serially
IRQs and Interrupts (Cont.)
• Global masking/unmasking of maskable interrupts
  – The cli and sti instructions clear and set, respectively, the IF flag of the eflags register
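The effect of cli and sti can be modeled on a saved copy of the flags register. This is only a user-space sketch: the function names are made up for illustration, and the one hardware fact relied on is that IF is bit 9 of the register.

```c
#include <assert.h>
#include <stdint.h>

/* IF is bit 9 of the flags register on the 80x86. */
#define X86_EFLAGS_IF (1u << 9)

/* Model of cli applied to a flags value: clear IF. */
static uint32_t model_cli(uint32_t eflags)
{
    return eflags & ~X86_EFLAGS_IF;
}

/* Model of sti applied to a flags value: set IF. */
static uint32_t model_sti(uint32_t eflags)
{
    return eflags | X86_EFLAGS_IF;
}

/* Nonzero when maskable interrupts would be delivered. */
static int maskable_interrupts_enabled(uint32_t eflags)
{
    return (eflags & X86_EFLAGS_IF) != 0;
}

/* Small self-check showing the intended use. */
static void if_flag_demo(void)
{
    uint32_t eflags = 0x202u;   /* IF set, plus bit 1, which always reads as 1 */
    eflags = model_cli(eflags);
    assert(!maskable_interrupts_enabled(eflags));
    eflags = model_sti(eflags);
    assert(maskable_interrupts_enabled(eflags));
}
```

Nonmaskable interrupts are not affected by IF, so this model says nothing about them.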
Exceptions
• The 80x86 issues roughly 20 different exceptions
• The kernel must provide a dedicated exception handler for each exception type
  – See Table 4-1
  – The exception handler usually sends a signal to the process that caused the exception
Exceptions (Cont.)
• (Vector, Name, Type) of each exception
• 0, “Divide error” (fault)
• 1, “Debug” (trap or fault)
• 2, Not used (reserved for nonmaskable interrupts)
• 3, “Breakpoint” (trap)
Exceptions (Cont.)
• 4, “Overflow” (trap)
• 5, “Bounds check” (fault)
• 6, “Invalid opcode” (fault)
• 7, “Device not available” (fault)
• 8, “Double fault” (abort)
• 9, “Coprocessor segment overrun” (abort)
Exceptions (Cont.)
• 10, “Invalid TSS” (fault)
• 11, “Segment not present” (fault)
• 12, “Stack segment” (fault)
• 13, “General protection” (fault)
• 14, “Page Fault” (fault)
• 15, Reserved by Intel
Exceptions (Cont.)
• 16, “Floating-point error” (fault)
• 17, “Alignment check” (fault)
• 18, “Machine check” (abort)
• 19, “SIMD floating point” (fault)
• 20~31: reserved by Intel for future development
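The vector list above maps naturally onto a lookup table. The sketch below simply transcribes the slides into C; the type and function names (exc_info, exc_lookup) are illustrative and are not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

enum exc_type { EXC_FAULT, EXC_TRAP, EXC_ABORT, EXC_TRAP_OR_FAULT, EXC_UNUSED };

struct exc_info {
    const char *name;
    enum exc_type type;
};

/* Vectors 0..19, transcribed from the slides. */
static const struct exc_info exc_table[] = {
    [0]  = { "Divide error",                EXC_FAULT },
    [1]  = { "Debug",                       EXC_TRAP_OR_FAULT },
    [2]  = { "Not used",                    EXC_UNUSED },
    [3]  = { "Breakpoint",                  EXC_TRAP },
    [4]  = { "Overflow",                    EXC_TRAP },
    [5]  = { "Bounds check",                EXC_FAULT },
    [6]  = { "Invalid opcode",              EXC_FAULT },
    [7]  = { "Device not available",        EXC_FAULT },
    [8]  = { "Double fault",                EXC_ABORT },
    [9]  = { "Coprocessor segment overrun", EXC_ABORT },
    [10] = { "Invalid TSS",                 EXC_FAULT },
    [11] = { "Segment not present",         EXC_FAULT },
    [12] = { "Stack segment",               EXC_FAULT },
    [13] = { "General protection",          EXC_FAULT },
    [14] = { "Page Fault",                  EXC_FAULT },
    [15] = { "Reserved by Intel",           EXC_UNUSED },
    [16] = { "Floating-point error",        EXC_FAULT },
    [17] = { "Alignment check",             EXC_FAULT },
    [18] = { "Machine check",               EXC_ABORT },
    [19] = { "SIMD floating point",         EXC_FAULT },
};

/* Returns the table entry for a vector, or NULL for vectors 20 and above. */
static const struct exc_info *exc_lookup(unsigned vector)
{
    if (vector >= sizeof exc_table / sizeof exc_table[0])
        return NULL;
    return &exc_table[vector];
}

/* Small self-check showing the intended use. */
static void exc_table_selfcheck(void)
{
    assert(exc_lookup(14)->type == EXC_FAULT);   /* Page Fault */
    assert(exc_lookup(8)->type == EXC_ABORT);    /* Double fault */
    assert(exc_lookup(31) == NULL);              /* reserved range */
}
```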
Interrupt Descriptor Table
• Interrupt Descriptor Table (IDT)
  – Associates each interrupt or exception vector with the address of the corresponding interrupt or exception handler
  – Initialized during system startup
Interrupt Descriptor Table (Cont.)
• Each IDT entry may hold one of three types of descriptors
  – Task gate
    • Includes the TSS selector of the process that must replace the current one when an interrupt signal occurs
    • Linux does not use task gates
Interrupt Descriptor Table (Cont.)
  – Interrupt gate
    • Includes the Segment Selector and the offset inside the segment of an interrupt or exception handler
    • While transferring control to the proper segment, the CPU clears the IF flag, which disables further maskable interrupts
  – Trap gate
    • Similar to an interrupt gate
    • But the CPU does not clear the IF flag while transferring control
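To make the gate contents concrete, the sketch below packs a 32-bit interrupt or trap gate into its two 32-bit words following the 80x86 gate layout (offset split in two halves, selector, present bit, DPL, and type). The names gate_desc and make_gate are illustrative only, and the handler address and selector used in the self-check are made-up values.

```c
#include <assert.h>
#include <stdint.h>

/* Type codes of 32-bit gates. */
#define GATE_TASK      0x5u
#define GATE_INTERRUPT 0xEu
#define GATE_TRAP      0xFu

/* One 8-byte IDT entry, viewed as two 32-bit words. */
struct gate_desc {
    uint32_t lo;   /* bits 31..16: segment selector, bits 15..0: offset 15..0 */
    uint32_t hi;   /* bits 31..16: offset 31..16, bit 15: P, bits 14..13: DPL,
                      bits 11..8: type */
};

static struct gate_desc make_gate(uint32_t handler, uint16_t selector,
                                  unsigned type, unsigned dpl)
{
    struct gate_desc g;

    assert(dpl <= 3 && type <= 0xF);
    g.lo = ((uint32_t)selector << 16) | (handler & 0xFFFFu);
    g.hi = (handler & 0xFFFF0000u)
         | (1u << 15)               /* P: descriptor is present */
         | ((uint32_t)dpl << 13)
         | ((uint32_t)type << 8);
    return g;
}

/* Reassembles the handler offset from the two halves. */
static uint32_t gate_offset(struct gate_desc g)
{
    return (g.hi & 0xFFFF0000u) | (g.lo & 0xFFFFu);
}

/* Only interrupt gates make the CPU clear IF on entry. */
static int gate_clears_if(struct gate_desc g)
{
    return ((g.hi >> 8) & 0xFu) == GATE_INTERRUPT;
}

/* Self-check with made-up handler address and selector. */
static void gate_demo(void)
{
    struct gate_desc g = make_gate(0xC0101234u, 0x10, GATE_TRAP, 3);
    assert(gate_offset(g) == 0xC0101234u);
    assert(!gate_clears_if(g));
}
```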
Interrupt Descriptor Table (Cont.)
• The idtr register specifies both the base linear address of the IDT and its limit (maximum length)
  – It must be initialized with the lidt instruction before interrupts are enabled
Hardware Handling of Interrupts and Exceptions
• When an interrupt or exception occurs, the CPU control unit performs the following steps
  – Determines the vector i
  – Reads the ith entry of the IDT, which is located through the idtr register
  – Gets the base address of the GDT from the gdtr register and reads the Segment Descriptor identified by the Segment Selector in the IDT entry
Hardware Handling of Interrupts and Exceptions (Cont.)
  – Makes sure that the interrupt was issued by an authorized privilege level
  – Checks whether a change of privilege level is needed
  – If a fault occurred, loads cs and eip with the logical address of the instruction that caused the exception
    • Before this step, cs and eip point to the instruction that follows the faulting one
Hardware Handling of Interrupts and Exceptions (Cont.)
  – Saves the contents of eflags, cs, and eip on the stack
  – Loads cs and eip with the Segment Selector and Offset fields of the Gate Descriptor stored in the ith entry of the IDT
    • These values define the logical address of the first instruction of the interrupt or exception handler
    • This is equivalent to a jump to the first instruction of the handler
Hardware Handling of Interrupts and Exceptions (Cont.)
• After the interrupt or exception has been processed
  – The handler issues the iret instruction, which returns control to the interrupted code
Nested Execution of Exception and Interrupt Handlers
• When handling an interrupt or exception
  – The kernel begins a new kernel control path
• Linux does not allow process switching while the CPU is executing a kernel control path associated with an interrupt
  – However, Linux allows kernel control paths to be nested
  – This gives rise to nested execution of interrupt and exception handlers
Nested Execution of Exception and Interrupt Handlers (Cont.)
• At most two kernel control paths associated with exceptions can be stacked
  – The first one is caused by a system call invocation
  – The second one is caused by a Page Fault
• This is because
  – The kernel is assumed to be bug free, so kernel code raises no exceptions
  – However, a Page Fault may still occur in Kernel Mode
    • It happens when the kernel addresses a page that belongs to the process's address space but is not currently in RAM
  – The Page Fault handler never gives rise to further exceptions
Nested Execution of Exception and Interrupt Handlers (Cont.)
• An interrupt handler may preempt both other interrupt handlers and exception handlers
• But an exception handler never preempts an interrupt handler
• The only exception that can be triggered in Kernel Mode is the Page Fault
  – In practice, however, no Page Fault occurs inside an interrupt handler
Nested Execution of Exception and Interrupt Handlers (Cont.)
• Why Linux interleaves kernel control paths
  – To improve the throughput of interrupt controllers and device controllers
  – To implement an interrupt model without priority levels
    • Each interrupt handler may be deferred by another one, so there is no need to establish priorities among devices
    • This simplifies the kernel code and improves its portability
Initializing the Interrupt Descriptor Table
• Before the kernel enables interrupts, it must
  – Load the address of the IDT into the idtr register
  – Initialize all the entries of that table
• Some interrupts must not be issued by a User Mode process
  – Set the DPL (Descriptor Privilege Level) field of the gate to 0
• In a few cases, a User Mode process must be able to issue a programmed exception
  – Set the DPL to 3
Initializing the Interrupt Descriptor Table (Cont.)
• Intel provides three types of gate descriptors
  – Task Gate
  – Interrupt Gate
  – Trap Gate
• Linux does not use Task Gate descriptors; it uses only Interrupt and Trap Gate descriptors
Initializing the Interrupt Descriptor Table (Cont.)
• Linux classifies Interrupt and Trap Gate descriptors as follows
  – Interrupt gate
    • An Intel interrupt gate that cannot be accessed by a User Mode process (DPL = 0)
    • All Linux interrupt handlers are activated by means of interrupt gates
  – System gate
    • An Intel trap gate that can be accessed by a User Mode process (DPL = 3)
    • The four Linux exception handlers associated with vectors 3, 4, 5, and 128 are activated by means of system gates
Initializing the Interrupt Descriptor Table (Cont.)
  – Trap gate
    • An Intel trap gate that cannot be accessed by a User Mode process (DPL = 0)
    • Most Linux exception handlers are activated by means of trap gates
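The Linux naming of gates is just a combination of the Intel gate type and the DPL. A small sketch of that classification follows; all identifiers are illustrative, and the privilege check models only the rule that a User Mode int instruction needs the gate's DPL to be 3.

```c
#include <assert.h>

enum intel_gate { INTEL_INTERRUPT_GATE, INTEL_TRAP_GATE };

enum linux_gate {
    LINUX_INTERRUPT_GATE,   /* Intel interrupt gate, DPL 0 */
    LINUX_SYSTEM_GATE,      /* Intel trap gate, DPL 3 */
    LINUX_TRAP_GATE,        /* Intel trap gate, DPL 0 */
    LINUX_UNCLASSIFIED      /* combination not named on the slides */
};

static enum linux_gate classify_gate(enum intel_gate type, unsigned dpl)
{
    assert(dpl <= 3);
    if (type == INTEL_INTERRUPT_GATE && dpl == 0)
        return LINUX_INTERRUPT_GATE;
    if (type == INTEL_TRAP_GATE && dpl == 3)
        return LINUX_SYSTEM_GATE;
    if (type == INTEL_TRAP_GATE && dpl == 0)
        return LINUX_TRAP_GATE;
    return LINUX_UNCLASSIFIED;
}

/* A User Mode process runs at privilege level 3, so an int instruction
 * is accepted only if the current level does not exceed the gate's DPL. */
static int user_int_allowed(unsigned gate_dpl)
{
    assert(gate_dpl <= 3);
    return 3 <= gate_dpl;
}
```

Under this model, vector 128 (a system gate) is reachable from User Mode, while a trap gate such as the one for the Page Fault is not.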
Exception Handling (Cont.)
• Most exceptions are handled simply by sending a Unix signal to the process that caused the exception
  – The signal notifies the process of an anomalous condition
Exception Handling
• Exception handler
  – Saves the contents of most registers on the Kernel Mode stack
  – Handles the exception by means of a high-level C function
    • Stores the hardware error code, if any, and the exception vector in the process descriptor of current
    • Sends a suitable signal to that process (Table 4-1)
  – Exits from the handler by means of the ret_from_exception() function
Exception Handling (Cont.)
• The current process takes care of the signal right after the exception handler terminates
  – The signal is handled either in User Mode by the process's own signal handler, if one exists
  – Or in Kernel Mode
    • In this case the kernel usually kills the process
Interrupt Handling
• Interrupt handling depends on the type of interrupt
  – I/O interrupts
  – Timer interrupts
    • Discussed in Chapter 6
  – Interprocessor interrupts
    • Discussed in a later section
I/O Interrupt Handling
• An I/O interrupt handler must be flexible enough to service several devices at the same time
  – In the PCI architecture, several devices may share the same IRQ line
I/O Interrupt Handling (Cont.)
• How to make the interrupt handler flexible
  – IRQ sharing: the interrupt handler executes several interrupt service routines (ISRs)
  – IRQ dynamic allocation: an IRQ line is associated with a device at the last possible moment
    • For example, the IRQ line of the floppy device is allocated only when a user accesses the floppy disk device
    • The same IRQ vector may be used by several devices, though not at the same time, even if they cannot share the IRQ line
I/O Interrupt Handling (Cont.)
• Issues in interrupt handling
  – Long noncritical operations must be deferred, because signals on the same IRQ line are temporarily ignored while the interrupt handler is running
  – The handler cannot perform any blocking procedure, such as an I/O operation
    • Otherwise, the process state would change from TASK_RUNNING to TASK_INTERRUPTIBLE (or TASK_UNINTERRUPTIBLE), and the system would freeze
I/O Interrupt Handling (Cont.)
• Linux divides the actions performed during interrupt handling into three classes
  – Critical: executed within the interrupt handler immediately, with maskable interrupts disabled
    • Acknowledging an interrupt to the PIC
    • Reprogramming the PIC or the device controller
    • Updating data structures accessed by both the device and the processor
I/O Interrupt Handling (Cont.)
  – Noncritical: executed by the interrupt handler immediately, with interrupts enabled
    • Updating data structures that are accessed only by the processor
  – Noncritical deferrable: performed by separate functions called bottom halves
    • Copying a buffer's contents into the address space of some process
I/O Interrupt Handling (Cont.)
• All interrupt handlers perform the same four basic actions
  – Save the IRQ value and the register contents on the Kernel Mode stack
  – Send an acknowledgment to the PIC that is servicing the IRQ line
    • This allows it to issue further interrupts
I/O Interrupt Handling (Cont.)
– Execute the interrupt service routines (ISRs) associated with all the devices that share the IRQ
– Terminate by jumping to the ret_from_intr() address
• Figure 4-3. I/O interrupt handling
I/O Vectors
• Table 4-2
– Physical IRQs may be assigned any vector in the range 32~238
– But Linux uses vector 128 to implement system calls
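The vector range above can be captured in a few lines. The sketch assumes the common mapping of IRQ n to vector n + 32 and a 16-line controller setup; the function names are illustrative, and the macro names follow the style of Linux 2.4 but are defined locally here.

```c
#include <assert.h>

#define FIRST_EXTERNAL_VECTOR 32u    /* first vector usable by physical IRQs */
#define LAST_EXTERNAL_VECTOR  238u   /* upper bound given on the slide */
#define SYSCALL_VECTOR        128u   /* reserved for system calls */
#define NR_IRQ_LINES          16u    /* assumption: 16 IRQ lines */

/* Nonzero if the vector may be assigned to a physical IRQ. */
static int vector_usable_for_irq(unsigned vector)
{
    return vector >= FIRST_EXTERNAL_VECTOR &&
           vector <= LAST_EXTERNAL_VECTOR &&
           vector != SYSCALL_VECTOR;
}

/* Assumed mapping: IRQ n is delivered on vector n + 32. */
static unsigned irq_to_vector(unsigned irq)
{
    assert(irq < NR_IRQ_LINES);
    return irq + FIRST_EXTERNAL_VECTOR;
}
```

With this mapping, IRQ0 arrives on vector 32 and IRQ13 on vector 45, both well away from vector 128.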
I/O Vectors (Cont.)
• Furthermore, the IBM-compatible PC architecture requires that some devices be statically connected to specific IRQ lines
  – The timer device must be connected to IRQ0
  – The slave 8259A PIC must be connected to IRQ2
  – The external mathematical coprocessor must be connected to IRQ13
I/O Vectors (Cont.)
• Table 4-3
– An example of IRQ assignment to I/O devices
IRQ Data Structures
• Fig. 4-4
• The irq_desc array contains irq_desc_t descriptors, each of which has the following fields
  – status: describes the IRQ line status
    • See Table 4-4 (flags)
  – handler: points to the hw_interrupt_type descriptor that identifies the PIC circuit servicing the IRQ line
    • There are several different types of PIC circuits
    • Discussed later
IRQ Data Structures (Cont.)
  – action: identifies the interrupt service routines (ISRs) to be invoked
    • Points to the first element of the list of irqaction descriptors
  – depth
    • 0: the IRQ line is enabled
    • Positive: the IRQ line has been disabled at least once
  – lock: a spin lock that serializes accesses to the IRQ descriptor in an SMP system
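The depth field behaves like a nesting counter: the line is masked on the first disable and unmasked only when every disable has been matched by an enable. The following is a user-space model of that rule; the struct and function names are hypothetical, and no locking is shown.

```c
#include <assert.h>

/* Simplified stand-in for the depth-related part of an IRQ descriptor. */
struct irq_line {
    unsigned depth;   /* 0: enabled, > 0: disabled that many times */
    int masked;       /* 1 while the line is masked at the PIC */
};

static void model_disable_irq(struct irq_line *d)
{
    assert(d != 0);
    if (d->depth++ == 0)
        d->masked = 1;          /* first disable: actually mask the line */
}

static void model_enable_irq(struct irq_line *d)
{
    assert(d != 0 && d->depth > 0);   /* an unbalanced enable is a caller bug */
    if (--d->depth == 0)
        d->masked = 0;          /* last enable: unmask the line */
}
```

Two nested disables therefore require two enables before interrupts on the line are delivered again.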
Table 4-4. Flags Describing the IRQ line status
• IRQ_INPROGRESS
  – A handler for the IRQ is being executed
• IRQ_DISABLED
  – The IRQ line has been disabled by a device driver
• IRQ_PENDING
  – An IRQ has occurred on the line and has been acknowledged to the PIC, but it has not yet been serviced by the kernel
Table 4-4. Flags Describing the IRQ line status (Cont.)
• IRQ_REPLAY
  – The IRQ line is disabled, but the previous IRQ occurrence has not yet been acknowledged
• IRQ_AUTODETECT
• IRQ_WAITING
• IRQ_LEVEL
• IRQ_MASKED
• IRQ_PER_CPU
  – Skipped here
IRQ Data Structures (Cont.)
• Suppose the 8259A PIC is used; then the handler field of the irq_desc_t descriptor
  – Points to the i8259A_irq_type variable, which has the following fields
    • “XT-PIC”: the PIC name
    • startup_8259A_irq: starts up an IRQ line of the chip
    • shutdown_8259A_irq: shuts down an IRQ line
    • enable_8259A_irq: enables an IRQ line
    • disable_8259A_irq: disables an IRQ line
    • mask_and_ack_8259A: acknowledges the IRQ to the 8259A and disables the IRQ line (uniprocessor system)
    • end_8259A_irq: invoked when the interrupt handler for the IRQ line terminates
    • NULL
IRQ Data Structures (Cont.)
• Note that
  – Except for the PIC name, all the entries are function pointers
  – For the 8259A
    • startup_8259A_irq = enable_8259A_irq
    • shutdown_8259A_irq = disable_8259A_irq
IRQ Data Structures (Cont.)
• To allow multiple devices to share a single IRQ
  – The kernel maintains irqaction descriptors; each descriptor refers to a specific hardware device and a specific interrupt
IRQ Data Structures (Cont.)
• irqaction
  – handler: points to the ISR
  – flags: describes the relationship between the IRQ line and the I/O device
  – name: the name of the I/O device
  – dev_id: a private field for the I/O device
    • E.g., the major and minor numbers
  – next: points to the next element of the irqaction list
Saving the Registers for the Interrupt Handler
• The interrupt handler for IRQn is named IRQn_interrupt
  – Its address is stored in the interrupt gate of the corresponding IDT entry
• Each interrupt handler first saves the registers
Saving the Registers for the Interrupt Handler (Cont.)
IRQn_interrupt:
    pushl $n-256            # save the IRQ number minus 256
    jmp common_interrupt

common_interrupt:
    SAVE_ALL                # see the following slide
    call do_IRQ
    jmp ret_from_intr
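Pushing n-256 rather than n keeps every stored IRQ value negative, which, as far as the textbook these slides follow explains, lets the kernel tell IRQs apart from system call numbers, which are non-negative. The original number is recovered from the low 8 bits. A sketch with illustrative function names:

```c
#include <assert.h>

/* Value pushed by the IRQn_interrupt stub; negative for 0 <= n <= 255. */
static int encode_irq(unsigned irq)
{
    assert(irq < 256);
    return (int)irq - 256;
}

/* Recovers the IRQ number by keeping only the low 8 bits. */
static unsigned decode_irq(int pushed)
{
    return (unsigned)pushed & 0xFFu;
}
```

For example, IRQ 14 is pushed as -242, and masking that value with 0xFF yields 14 again.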
Saving the Registers for the Interrupt Handler (Cont.)
• SAVE_ALL
  – Saves on the stack all the CPU registers that may be used by the interrupt handler
  – Except for eflags, cs, eip, ss, and esp
    • These are already saved automatically by the control unit
do_IRQ() function
• irq_desc[irq].handler->ack(irq)
  – Invokes the ack method of the IRQ descriptor (mask_and_ack_8259A())
• Sets irq_desc[irq].status to the proper value
• handle_IRQ_event()
  – Walks the irq_desc[irq].action list
  – That is, invokes the interrupt service routines sequentially
• irq_desc[irq].handler->end(irq)
  – Invokes the end method of the IRQ descriptor (end_8259A_irq())
• do_softirq(), if necessary
  – Executes the deferrable kernel functions
Interrupt Service Routines
• The handle_IRQ_event() function includes the following code
    do {
        action->handler(irq, action->dev_id, regs);
        action = action->next;
    } while (action);
• action->handler invokes the ISR with three parameters
  – irq: the IRQ number
  – dev_id: the device identifier
  – regs: a pointer to the Kernel Mode stack area containing the registers saved right after the interrupt occurred
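The loop above can be exercised outside the kernel with a cut-down descriptor. In this sketch the struct keeps only the fields needed for the walk, struct pt_regs is left opaque, and run_isrs and counting_isr are hypothetical names; unlike the do/while on the slide, the while loop also tolerates an empty list.

```c
#include <assert.h>
#include <stddef.h>

struct pt_regs;   /* opaque here; only passed through to the ISRs */

struct irqaction {
    void (*handler)(int irq, void *dev_id, struct pt_regs *regs);
    const char *name;
    void *dev_id;
    struct irqaction *next;
};

/* Runs every ISR registered on the line; returns how many were called. */
static int run_isrs(int irq, struct irqaction *action, struct pt_regs *regs)
{
    int called = 0;

    while (action != NULL) {
        assert(action->handler != NULL);
        action->handler(irq, action->dev_id, regs);
        called++;
        action = action->next;
    }
    return called;
}

/* Example ISR: counts its invocations in the int that dev_id points to. */
static void counting_isr(int irq, void *dev_id, struct pt_regs *regs)
{
    (void)irq;
    (void)regs;
    ++*(int *)dev_id;
}
```

Each ISR on a shared line is expected to check, through its dev_id, whether its own device actually raised the interrupt.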
Dynamic Allocation of IRQ Lines
• The same IRQ line can be used by several devices even if these devices do not allow IRQ sharing
  – The activations of the hardware devices are serialized so that just one device owns the IRQ line at a time
Dynamic Allocation of IRQ Lines (Cont.)
• Before activating a device, the driver invokes the request_irq() function
  – Creates and initializes a new irqaction descriptor
• Then, the setup_irq() function is invoked
  – Inserts the descriptor into the proper IRQ list
• Finally, when the operation is finished, the driver invokes the free_irq() function
  – Removes the descriptor from the IRQ list and releases the memory area
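The three steps amount to allocating a list node, linking it in, and later unlinking and freeing it. The sketch below models this with malloc and free in user space; the model_ function names and the reduced descriptor are hypothetical and ignore locking and sharing flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Reduced descriptor: just the device identifier and the list link. */
struct irqaction_model {
    void *dev_id;
    struct irqaction_model *next;
};

/* Creates a descriptor and appends it to the IRQ's list.
 * Returns 0 on success, -1 if memory cannot be allocated. */
static int model_request_irq(struct irqaction_model **head, void *dev_id)
{
    struct irqaction_model *a;

    assert(head != NULL);
    a = malloc(sizeof *a);
    if (a == NULL)
        return -1;
    a->dev_id = dev_id;
    a->next = NULL;
    while (*head != NULL)          /* walk to the tail of the list */
        head = &(*head)->next;
    *head = a;
    return 0;
}

/* Unlinks and releases the descriptor that matches dev_id.
 * Returns 0 on success, -1 if no such descriptor exists. */
static int model_free_irq(struct irqaction_model **head, void *dev_id)
{
    assert(head != NULL);
    for (; *head != NULL; head = &(*head)->next) {
        if ((*head)->dev_id == dev_id) {
            struct irqaction_model *victim = *head;
            *head = victim->next;
            free(victim);
            return 0;
        }
    }
    return -1;
}
```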
Softirqs, Tasklets, and Bottom Halves
• Linux uses three kinds of deferrable and interruptible kernel functions (in short, deferrable functions)
  – Softirqs
  – Tasklets
  – Bottom halves
Softirqs, Tasklets, and Bottom Halves (Cont.)
• No softirq can be interrupted to run another softirq on the same CPU
  – The same rule holds for tasklets and bottom halves, which are built on top of softirqs
• Deferrable functions must be executed serially
  – A deferrable function cannot be interleaved with other deferrable functions on the same CPU
Softirqs, Tasklets, and Bottom Halves (Cont.)
• Four kinds of operations can be performed on deferrable functions
  – Initialization
    • Defines a new deferrable function
  – Activation
    • Marks a deferrable function as “pending”
  – Masking
    • Selectively disables a deferrable function; introduced in Chapter 5
  – Execution
    • Executes a pending deferrable function together with all other pending deferrable functions of the same type
Softirqs, Tasklets, and Bottom Halves (Cont.)
• Activation and execution of a deferrable function are bound together
  – A deferrable function that has been activated by a given CPU must be executed on the same CPU
  – This makes better use of the CPU cache
Softirqs, Tasklets, and Bottom Halves (Cont.)
• Differences

Deferrable function | Dynamic allocation | Concurrency
Softirq | No | Softirqs of the same type can run concurrently on several CPUs
Tasklet | Yes | Tasklets of different types can run concurrently on several CPUs, but tasklets of the same type cannot
Bottom half | No | Bottom halves cannot run concurrently on several CPUs
Softirqs, Tasklets, and Bottom Halves (Cont.)
• Softirqs and bottom halves are statically allocated
  – Defined at compile time
• Tasklets can be allocated and initialized at run time
  – For instance, when loading a kernel module
Softirqs, Tasklets, and Bottom Halves (Cont.)
• Softirqs can be executed concurrently on several CPUs, even if they are of the same type
• Therefore, softirqs are re-entrant functions and must explicitly protect their data structures with spin locks
Softirqs, Tasklets, and Bottom Halves (Cont.)
• A tasklet is always serialized with respect to itself
  – The same tasklet cannot be executed by two CPUs at the same time
  – However, different tasklets can be executed concurrently on several CPUs
  – Thanks to this serialization, device driver development is much easier
    • A tasklet function need not be re-entrant
Softirqs, Tasklets, and Bottom Halves (Cont.)
• Bottom halves are globally serialized
  – When a bottom half is executing on one CPU, no other CPU can execute any bottom half, even one of a different type
  – This degrades the performance of the Linux kernel on multiprocessor systems
  – Thus, bottom halves are kept only for compatibility and are expected to disappear in the future
Softirqs
• Linux 2.4 uses four kinds of softirqs

Softirq | Index (priority) | Description
HI_SOFTIRQ | 0 | Handles high-priority tasklets and bottom halves
NET_TX_SOFTIRQ | 1 | Transmits packets to network cards
NET_RX_SOFTIRQ | 2 | Receives packets from network cards
TASKLET_SOFTIRQ | 3 | Handles tasklets
Softirqs (Cont.)
• Data structure 1: the softirq_vec[] array, whose elements are of type softirq_action, which has two fields
  – A pointer to the softirq function
  – A pointer to a generic data structure that may be needed by the softirq function
Softirqs (Cont.)
• Data structure 2: the irq_stat[] array, with one element per CPU; each element includes the following fields
  – __softirq_pending: a bit mask of the pending softirqs (bit n set means that the softirq with index n is pending)
  – __local_bh_count: enables (= 0) or disables (> 0) the execution of softirqs
  – __ksoftirqd_task: stores the process descriptor address of the ksoftirqd_CPUn kernel thread
    • This thread is devoted to the execution of deferrable functions
Softirqs (Cont.)
• Initialization: open_softirq()
  – Initializes the proper entry of the softirq_vec array
• Activation: the __cpu_raise_softirq macro
  – Sets the corresponding bit in __softirq_pending
• Masking: the local_bh_disable macro
  – Increments the __local_bh_count field
• Execution
  – The do_softirq() function
Softirqs (Cont.)
• Pending softirqs are checked
  – When the local_bh_enable macro re-enables the softirqs
  – When the do_IRQ() function finishes handling an I/O interrupt
  – When the smp_apic_timer_interrupt() function finishes handling a local timer interrupt
    • Chapter 6
  – When one of the ksoftirqd_CPUn kernel threads is awakened
  – When a packet is received on a network interface card
do_softirq()
• Returns if local_irq_count != 0
  – The CPU is executing an interrupt handler
• Returns if local_bh_count != 0
  – All deferrable functions are disabled
• Executes all pending softirqs, if any
  – Pending softirqs activated during the execution of the softirq functions are also executed
• Wakes up the ksoftirqd_CPUn kernel thread if a softirq is activated again during the execution of do_softirq()
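The checks and the pending mask can be modeled in a few lines of user-space C. This is a simplified sketch, not the kernel's code: it makes a single pass in index order and leaves softirqs raised during that pass pending, where the kernel would recheck or wake the ksoftirqd_CPUn thread. All identifiers are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SOFTIRQS 4u   /* the four softirqs of Linux 2.4 */

struct cpu_softirq_state {
    uint32_t pending;           /* bit n set: softirq n is pending */
    unsigned local_irq_count;   /* > 0 while inside an interrupt handler */
    unsigned local_bh_count;    /* > 0 while softirqs are masked */
    void (*action[NR_SOFTIRQS])(void);
};

/* Activation: mark softirq nr as pending. */
static void model_raise_softirq(struct cpu_softirq_state *s, unsigned nr)
{
    assert(s != NULL && nr < NR_SOFTIRQS);
    s->pending |= 1u << nr;
}

/* Execution: returns the number of softirq functions that were run. */
static int model_do_softirq(struct cpu_softirq_state *s)
{
    uint32_t todo;
    unsigned nr;
    int ran = 0;

    assert(s != NULL);
    if (s->local_irq_count != 0 || s->local_bh_count != 0)
        return 0;               /* in an interrupt handler, or masked */

    s->local_bh_count++;        /* keep softirqs from nesting */
    todo = s->pending;
    s->pending = 0;
    for (nr = 0; nr < NR_SOFTIRQS; nr++) {   /* index 0 runs first */
        if ((todo & (1u << nr)) && s->action[nr] != NULL) {
            s->action[nr]();
            ran++;
        }
    }
    s->local_bh_count--;
    return ran;
}

/* Example softirq functions that record the order in which they ran. */
static int run_log[8];
static int run_len;
static void log_hi(void)      { run_log[run_len++] = 0; }
static void log_tasklet(void) { run_log[run_len++] = 3; }
```

Raising index 3 and then index 0 still runs index 0 first, which is the sense in which the index acts as a priority.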
The softirq kernel thread
• Checks for pending softirqs and, if necessary, invokes do_softirq()
• Why introduce the ksoftirqd_CPUn kernel threads?
  – Consider packet flooding on a NIC: softirqs may be activated at a very high frequency
The softirq kernel thread (Cont.)
• Two alternative approaches, each with a drawback
  – Ignore new softirqs that occur while do_softirq() is running
    • The resulting softirq latency is unacceptable to networking developers
  – Continuously recheck for pending softirqs
    • The do_softirq() function never returns, and User Mode programs are virtually stopped
The softirq kernel thread (Cont.)
• Thus, the softirq kernel thread has low priority
  – User programs have a chance to run
• But if the machine is idle, the pending softirqs are executed quickly
Tasklets
• Tasklets are the preferred way to implement deferrable functions in I/O drivers
• Tasklets are built on top of two softirqs
  – HI_SOFTIRQ
  – TASKLET_SOFTIRQ
Tasklets (Cont.)
• Data structure 1
  – tasklet_vec[] and tasklet_hi_vec[]
  – Each element consists of a pointer to a list of tasklet descriptors
• Data structure 2
  – Tasklet descriptor: tasklet_struct
Tasklets (Cont.)
• tasklet_struct
  – next: pointer to the next descriptor in the list
  – state: status of the tasklet
    • TASKLET_STATE_SCHED: the tasklet is pending
    • TASKLET_STATE_RUN: the tasklet is being executed
  – count: lock counter
  – func: pointer to the tasklet function
  – data: an unsigned long integer that may be used by the tasklet function
Tasklets (Cont.)
• Initialization: tasklet_init()
  – Initializes the relevant data structures
• Masking: tasklet_disable_nosync() or tasklet_disable()
  – Increments the count field of the tasklet descriptor
• Activation: tasklet_schedule() or tasklet_hi_schedule()
  – Adds the tasklet descriptor to the list pointed to by tasklet_vec[] or tasklet_hi_vec[]
  – Invokes cpu_raise_softirq() to activate either the TASKLET_SOFTIRQ or the HI_SOFTIRQ softirq
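How the state bits and the count field interact can be shown with a single-CPU model. The sketch below is an assumption-laden simplification: one global list stands in for tasklet_vec[], there is no locking, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define STATE_SCHED 0x1u   /* the tasklet is pending */
#define STATE_RUN   0x2u   /* the tasklet is being executed */

struct tasklet_model {
    struct tasklet_model *next;
    unsigned state;
    int count;                       /* > 0: the tasklet is disabled */
    void (*func)(unsigned long);
    unsigned long data;
};

/* Stands in for the per-CPU list head of tasklet_vec[]. */
static struct tasklet_model *pending_list;

/* Activation: returns 1 if queued, 0 if the tasklet was already pending. */
static int model_tasklet_schedule(struct tasklet_model *t)
{
    assert(t != NULL && t->func != NULL);
    if (t->state & STATE_SCHED)
        return 0;
    t->state |= STATE_SCHED;
    t->next = pending_list;
    pending_list = t;
    return 1;
}

/* Execution: runs each enabled pending tasklet once and returns how many
 * ran; disabled tasklets are put back on the list and stay pending. */
static int model_tasklet_action(void)
{
    struct tasklet_model *list = pending_list;
    int ran = 0;

    pending_list = NULL;
    while (list != NULL) {
        struct tasklet_model *t = list;
        list = list->next;
        if (t->count == 0) {
            t->state = (t->state & ~STATE_SCHED) | STATE_RUN;
            t->func(t->data);
            t->state &= ~STATE_RUN;
            ran++;
        } else {
            t->next = pending_list;
            pending_list = t;
        }
    }
    return ran;
}

/* Example tasklet function: adds its data argument to a running sum. */
static unsigned long demo_sum;
static void demo_add(unsigned long data) { demo_sum += data; }
```

Scheduling the same tasklet twice before it runs queues it only once, which matches the idea that a pending tasklet is just marked, not counted.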
Tasklets (Cont.)
• Execution: via do_softirq(), which executes
  – The HI_SOFTIRQ softirq function: tasklet_hi_action()
  – The TASKLET_SOFTIRQ softirq function: tasklet_action()
Bottom Halves
• A bottom half is essentially a high-priority tasklet that cannot be executed concurrently with any other bottom half
  – Even if the other one is of a different type and runs on another CPU
  – The global_bh_lock spin lock is used to guarantee this
Bottom Halves (Cont.)
• Data structure: bh_base[]
  – An array of pointers to bottom halves
Bottom Halves (Cont.)
• Initialization: init_bh(n, routine)
  – Inserts the routine address as the nth entry of bh_base
• Activation: mark_bh()
  – Calls tasklet_hi_schedule(), since bottom halves are high-priority tasklets
• Execution: bh_action()
Extending a bottom half
• In addition to handling interrupts, bottom halves can be extended
  – To allow not only a function that services an interrupt, but also a generic kernel function, to be executed as a bottom half
  – To allow several kernel functions, instead of a single one, to be associated with a bottom half
Extending a bottom half (Cont.)
• Groups of functions are represented by task queues
  – A task queue is a list of tq_struct structures, each with the following fields
    • list: links for the list
    • sync: used to prevent multiple activations
    • routine: the function to call
    • data: the argument for the function
• I/O device drivers use task queues to request the execution of several functions when a specific interrupt occurs
  – See Chapter 13
Returning from Interrupts and Exceptions
• Issues that must be considered before terminating an interrupt or exception handler
  – Number of kernel control paths being concurrently executed
    • If there is just one, the CPU must switch back to User Mode
  – Pending process switch requests
    • If need_resched = 1, the kernel performs process scheduling
    • Otherwise, control returns to the current process
  – Pending signals
    • If any exist, they must be handled
Returning from Interrupts and Exceptions (Cont.)
• ret_from_exception()
  – Terminates all exceptions except the 0x80 ones (system calls)
• ret_from_intr()
  – Terminates interrupt handlers
• ret_from_sys_call()
  – Terminates system calls (0x80 programmed exceptions)
• ret_from_fork()
  – Terminates the fork(), vfork(), or clone() system calls
Returning from Interrupts and Exceptions (Cont.)
• Both ret_from_intr() and ret_from_exception() check whether they are running in a nested kernel control path
• ret_from_intr(), ret_from_exception(), ret_from_sys_call(), and ret_from_fork() all
  – Check whether rescheduling is needed
  – Check whether pending signals exist