9/23/2003 2:11 PM, MSR Transmit Stall, Washington University in St. Louis

APIC Stalling problem Notes

with additional notes on Interrupts

John DeHart

Washington University

[email protected]

http://www.arl.wustl.edu/~jdd


Issue

• There seems to be a bug in the system
  – Right now it only shows up on SPC-II
  – In the past, it has shown up on SPC-I
    • But this could be similar symptoms of different problems.
  – No recollection of it ever showing up on end hosts.
  – All these different systems have different timing
• We ran into this problem in preparing for and doing the WU 150th anniversary demo.
• Fred is having this problem in his kernel testing.
• JohnD is having this problem in his final SPC-II performance testing.


Issue (continued)

• Symptoms:
  – Transmit queue stalls for paced connections
    • Resuming the connection as BE (then Paced) clears the queue most of the time
      – sometimes it then stalls again and eventually we cannot resume it.
  – Also stalls for BE connections
    • Resuming as BE gets the data flowing again
    • actually, resuming ANOTHER channel causes the stalled channel to resume
      – This seems to imply a possible “global” pacer problem?
  – When it stalls and the APIC runs out of descriptors, we do get an ERROR interrupt for the out-of-descriptors state.
    • This seems to imply that the APIC and ICU are in a state such that they can generate an APIC interrupt to the CPU.
    • If the APIC had generated an interrupt that had been “lost”, the APIC and/or ICU would probably not be in a state that would allow another APIC interrupt to reach the CPU.
• Seems to be traffic rate related
  – as the traffic rate approaches the limit of what we can process, the problem is more likely to show itself
• Seems to be SPC related
  – some SPCs show the problem more readily than others


Tools Assembled

• Monitoring GUI
• PCI Bus analyzer
  – setup for it saved in
    • jddlap:/C/SPC_II/PCI_Traces/SPC_II_PCI_Setup.stp
    • /project/arl/jdd/SPC_II/PCI_Traces/SPC_II_PCI_Setup.stp
• SPCWatch
  – uses APIC control cells to dump portions of memory and APIC registers without going through the kernel. Going through the kernel sometimes changes the state of the current problem by resuming a stalled xmit connection. Depending on how much memory is being dumped, this may take a long time (16 bytes of memory per APIC control cell).
  – scripts using it are in:
    • /d/jdd/wu_arl/HARDWARE_TESTS/SPC_TEST_PCI/
      – dumpAllMSRDescs: dumps ALL 64K APIC Descriptors
      – dumpMSRrxDescs: dumps all 8K Rx descriptors
      – dumpMSRtxDescs: dumps all 8K Tx descriptors
      – getTxConnAndChanStatusRegs: retrieves the Connection and Channel Status regs for a Tx chan.
• sencmd
• SPC:/usr/local/bin/datatest
• SPC:/usr/local/bin/readCounts
• Jammer


Useful Notes

• APIC Sync Bits:
  – 0: DONE_VALIDLINK (APIC is done, belongs to the Driver now)
  – 1: DONE_INVALIDLINK (should never happen for Tx)
  – 2: NOT_READY (Belongs to the Driver)
  – 3: READY (Belongs to the APIC)
• Kernel modified to support PCI Bus Analyzer
  – Bus Analyzer requires line card to be removed from SPC-II
  – SPC-II with no line card does not get a grant for sending data to the line card so the FPGA Fifos fill up and drop cells. Ick.
  – Kernel modified to send external data to switch
    • with this we can also monitor the output rate of the external data VCs.
• APIC Descriptor Address Ranges:
  – Index 0: Addr 0x1d17000 : Invalid descriptor
  – Index 0001 – 8192: Addr 0x1d17010 – 0x1d37000 : Rx Descs
  – Index 8193 – 16384: Addr 0x1d37010 – 0x1d57000 : Tx Descs
• APIC Registers of interest:
  – 0x518 : Interrupt Acknowledge Register
  – 0x530 : Notification Register
  – 0xD500CH08: TX Channel status register
  – 0xD500CHF0: TX Channel BE Resume register
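The descriptor address ranges in these notes are consistent with 16-byte descriptors laid out contiguously from 0x1d17000 (index 1 is at +0x10, index 8192 at +0x20000). A minimal C sketch under that assumption is given here; the 16-byte stride is inferred from the listed addresses, not stated in the notes, and all identifiers (`desc_addr`, `desc_owned_by_apic`, the `SYNC_*` names) are hypothetical rather than taken from apic.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sync-bit values as listed in the notes; the enum names are hypothetical. */
enum desc_sync {
    SYNC_DONE_VALIDLINK   = 0, /* APIC is done; descriptor belongs to the driver */
    SYNC_DONE_INVALIDLINK = 1, /* should never happen for Tx */
    SYNC_NOT_READY        = 2, /* belongs to the driver */
    SYNC_READY            = 3  /* belongs to the APIC */
};

#define DESC_BASE_ADDR  0x1d17000u /* address of index 0, the invalid descriptor */
#define DESC_STRIDE     16u        /* ASSUMPTION: inferred from the address table */
#define DESC_LAST_INDEX 16384u     /* last Tx descriptor index in the table */

/* Memory address of the descriptor with the given index. */
uint32_t desc_addr(uint32_t index)
{
    assert(index <= DESC_LAST_INDEX);
    return DESC_BASE_ADDR + index * DESC_STRIDE;
}

/* Only READY descriptors are owned by the APIC; every other state is the driver's. */
bool desc_owned_by_apic(enum desc_sync s)
{
    return s == SYNC_READY;
}
```

With this mapping, the descriptor at offset 15851 reported in the later panic message would sit at 0x1d17000 + 15851 * 16, which is a quick way to locate it in an SPCWatch memory dump.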


Questions

• Does the driver handle multiple buffers chained together on receive properly?
  – It is possible for the last cell of a packet to get dropped, making the packet look like a long packet spanning multiple buffers.
• Are there any buffer start address concerns?
  – old notes on an APIC bug which caused us to align buffers on 48 and 56 byte boundaries
  – This is the RX Sync bug (July 1999 Kits slides) which locks up the APIC and needs a reset to get going again. This does not sound like what is happening to us now.
    • although, could this be what eventually happens after a few resumes when the SPC locks up?


Issue (continued)

• Suspects:
  – “Lost” interrupt
  – APIC Hardware bug
    • interrupt handling
      – timing between two instances of INTR signal being asserted.
    • descriptor handling
    • pacer
    • flow control
    • other?
  – APIC driver bug
    • interrupt handling
    • descriptor handling
    • other?
  – NetBSD Interrupt handling bug
  – SPC-II FPGA flow control bug


Issue (continued)

• Plan of Attack:
  – Analyze apic driver code:
    • compare MSR vs. end host driver code.
  – Get details of descriptor chain when it stalls:
    • dump APIC descriptor chain as it exists in memory
    • dump APIC current descriptor chain register for stalled channel
  – monitor interrupt counts on SPC-II and compare to packet counts
    • vmstat -i
  – Note what IRQs are assigned to what at boot time.
  – Turn off SPC-II FPGA flow control to APIC
    • change VHDL
    • rebuild bitfile
    • re-program SPC-II FPGA
    • retest


Additional Issue/Symptom

• We sometimes get into a state where:
  – We send an MSR command/control cell to a port
  – The APIC does not register a cell arrival.
  – Neither the OPP transmit cell counter nor the OPP drop cell counters on that port increment.
• Suspect: APIC or FPGA flow control issue


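SPC II FPGA Architecture: SPC-II Clock Domains

[Figure: block diagram of the SPC-II FPGA. Labels recoverable from the slide: FPXLC Switch, APIC (Port 0, Port 1), PCI Bus Port, SPC-II FPGA, blocks numbered 1 to 6, letter labels A, B, C, D, E, G, H, OSC, Reset; path annotations VPI[0]=0, VPI[0]=1, 64<=VCI<=127???, VPI[0]=1 VCI = 38; bus widths 16, 32 and 16/32.]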


SPC FPGA Fifos

• FIFO 1: Large Sync Fifo: 512 Words: 36 cells
• FIFO 2: Large Async Fifo: 512 Words: 36 cells
• FIFO 3: Tiny Sync Fifo: 64 Words: 4 cells
• FIFO 4: Tiny Sync Fifo: 64 Words: 4 cells
• FIFO 5: Medium Async Fifo: 128 Words: 9 cells
• FIFO 6: Medium Sync Fifo: 128 Words: 9 cells
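The words-to-cells figures in the FIFO list are consistent with whole cells of 14 words each (512/14, 64/14 and 128/14 all round down to the listed counts). The sketch below encodes that arithmetic; the 14-word cell size is an inference from the table, not something the notes state, and `fifo_cells` is a hypothetical helper name.

```c
#include <assert.h>

/* ASSUMPTION: 14 FIFO words per cell, inferred from the listed capacities. */
#define WORDS_PER_CELL 14u

/* Compile-time check that the assumption reproduces the 512-word row. */
static_assert(512u / WORDS_PER_CELL == 36u, "512 words should hold 36 cells");

/* Number of whole cells that fit in a FIFO of the given depth in words. */
unsigned fifo_cells(unsigned words)
{
    return words / WORDS_PER_CELL; /* a partially stored cell does not count */
}
```

If the real cell size differs, only `WORDS_PER_CELL` needs to change; no other integer value reproduces all three listed capacities.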


Flow Control Test #1

• Send data from Switch to SPC-II
  – transit through APIC from Port 1 to Port 0
  – SPC-II is reset, no kernel running
  – No data crossing PCI bus
  – No descriptors/buffers used
• Overload 16 bit APIC interface
  – Send ~ 1.2 Gb/s
  – ~ 982 Mb/s goes through APIC
  – ~ 220 Mb/s is dropped in OPP CS0 buffer
• Turn data on/off repeatedly
  – no stall/hang-up
  – when data is turned back on it continues to transit APIC


Flow Control Test #2A (AAL5Generator)

• Send data from Switch to SPC-II
  – Load kernel (JDD’s BE Debug Kernel) and process packets
  – Configure switch and routes so that two input ports (P1, P5) get a copy of the traffic to be routed.
  – Configure the two input ports (P1, P5) routes so that they route the traffic to Egress port 0
• Overload APIC processing in Kernel on Port 0
  – send 60 Mb/s at each input port
    • using AAL5Generator: smooth pacing at batch (8) of cells level
  – total of 120 Mb/s at output port
  – pkt sz = 1500 bytes (
• Kernel error messages:
  – RX CID (65 and 69) out of descriptors
    • indicates: we are sending more data at the kernel than it can handle
  – Bad CRC
    • indicates: cells are being dropped somewhere
      – either APIC or SPC-II FPGA.
      – Probably APIC; if it were the FPGA, it would flow control the switch
      – but we may not be sending enough for FC to back up all the way through the OPP buffer.
• But no cells are dropped in OPP
  – indicates: SPC-II FPGA is not flow controlling switch
• System remains stable:
  – ran for several minutes in this state with no stall or hang.
• Increase rate to 80 Mb/s at each input port
  – system continues to remain stable
  – no drops in OPP


dips in output are due to Kernel printing error msgs


Flow Control Test #2B (sendpkts)

• Send data from Switch to SPC-II
  – Load kernel (JDD’s BE Debug Kernel) and process packets
  – Configure switch and routes so that two input ports (P1, P5) get a copy of the traffic to be routed.
  – Configure the two input ports (P1, P5) routes so that they route the traffic to Egress port 0
• Overload APIC processing in Kernel on Port 0
  – send 60 Mb/s at each input port
    • using sendpkts: sends batches of packets
  – total of 120 Mb/s at output port
  – pkt sz = 1500 bytes (
• Kernel error messages:
  – RX CID (65 and 69) out of descriptors
    • indicates: we are sending more data at the kernel than it can handle
  – Bad CRC
    • indicates: cells are being dropped somewhere
      – either APIC or SPC-II FPGA.
      – Probably APIC; if it were the FPGA, it would flow control the switch
      – but we may not be sending enough for FC to back up all the way through the OPP buffer.
• But no cells are dropped in OPP
  – indicates: SPC-II FPGA is not flow controlling switch
• System remains stable:
  – ran for several minutes in this state with no stall or hang.
• Increase rate to 80 Mb/s at each input port
  – system continues to remain stable
  – no drops in OPP


Fred’s Kernel: sendpkts -B 40 -p 10 -a 20 -c -S


Fred’s Kernel: sendpkts -B 80 -p 10 -a 20 -c -S


Analysis of previous screen dump

• P0 has stopped sending any pkts out to the link
• P0 has stopped back pressuring the switch
• APIC interrupts still being generated and counted by kernel (vmstat -i)
• APIC still counting cells arriving
• APIC NOT counting cells on PCI bus
• APIC thinks it is getting cells and generating Interrupts.
• What does the kernel think in this state?
  – channel is suspended and needs resuming...
    • when resumed things start working again.
  – This is probably the “Ready descriptor” error
• So in this state the APIC is out of descriptors and all of its cell buffers are probably full.
  – Is it just continually generating ERROR interrupts?
  – And discarding every cell it receives (after counting it)?


JDD’s version of Fred’s Kernel: sendpkts -c -v -S -a 20 -x 8000

BE Resume of P0 channel 80 resumes data output.


Analysis of previous screen dump

• After resuming BE twice (each worked), the third time it “stalled” the kernel had crashed.
  – panic: kernel assertion “0” failed: apic.c, line 1045
  – This assert checks that a TX descriptor being allocated from the free list has its SYNC bits set to NOT_READY.
    • need to repeat the test with proper debug turned on so we can see what descriptor it is and what the sync bits are actually set to.
• Repeated, after 8 successful resume BE:
    Port 0 (APIC/Crit): msr_apic_txdesc_alloc: Desc->MatchFlags != DESC_SYNC_NOT_READY!, offset = 15851 sync = 0
    panic: kernel assertion “0” failed: file “../../../../dev/ic/apic.c” line 1045
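The failing check can be pictured as a guard on the free-list pop: a descriptor handed back to the driver should carry NOT_READY (2), yet the log shows sync = 0 (DONE_VALIDLINK). The following is a minimal sketch of such a guard, not the code in apic.c; the struct layout, the position of the sync bits in the low two bits of `match_flags`, and every function and enum name are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_SYNC_MASK      0x3u /* ASSUMPTION: sync bits are the low two bits */
#define DESC_SYNC_NOT_READY 0x2u /* value from the sync-bit list in these notes */

struct tx_desc {                 /* hypothetical layout */
    uint32_t match_flags;        /* holds the sync bits */
    struct tx_desc *next_free;   /* free-list link */
};

enum txdesc_status { TXDESC_OK, TXDESC_EMPTY, TXDESC_BAD_SYNC };

/*
 * Pop the head of the free list into *out. If its sync bits are not
 * NOT_READY, leave the list untouched and report the value seen, which is
 * the situation the kernel assertion at apic.c line 1045 turns into a panic.
 */
enum txdesc_status txdesc_alloc(struct tx_desc **free_list,
                                struct tx_desc **out, unsigned *sync_seen)
{
    assert(free_list != NULL && out != NULL && sync_seen != NULL);

    struct tx_desc *d = *free_list;
    if (d == NULL)
        return TXDESC_EMPTY;

    *sync_seen = d->match_flags & DESC_SYNC_MASK;
    if (*sync_seen != DESC_SYNC_NOT_READY)
        return TXDESC_BAD_SYNC;  /* e.g. sync = 0, as in the log above */

    *free_list = d->next_free;
    d->next_free = NULL;
    *out = d;
    return TXDESC_OK;
}
```

Returning a status instead of asserting makes it possible to log the descriptor index and sync value before deciding whether to panic, which is the extra debug information the notes ask for.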


APIC errors detected:

APIC errors that occurred during different runs.
------------------------------------------------------
Port -1 (Ctl/Info): msr_process_ctlcell: cmd 0x1, ver 0, seq 0, len 4, flags 0x9
Port 0 (APIC/Error): apic_intr: Unexpected RX Error on CID = 65, chanstatus = 0x07
apic0: Descriptor Error: Match incorrect (not 0xcafe) 0x07
------------------------------------------------------
Port 0 (APIC/Crit): msr_free_txdescs: Invalid tx desc index (current 14250 or next 128)
panic: kernel assertion "((((txindx) >= ((0x00000001 + 8192 - 1) + 1)) && ((txindx) <= (((0x00000001 + 8192 - 1) + 1) + 8192 - 1))) && (((nextindx) >= ((0x00000001 + 8192 - 1) + 1)) && ((nextindx) <= (((0x00000001 + 8192 - 1) + 1) + 8192 - 1))))" failed: file "../../../../dev/ic/apic.c", line 2332
Stopped at 0xf018ff8c: leave
db>
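The expanded assertion in the second panic is easier to read once the constants are folded: it requires both the current and the next index to lie in 8193..16384, the Tx descriptor range. The sketch below reproduces that test; the constants are read off the assertion text, while the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Constants as they appear in the expanded assertion; names are hypothetical. */
#define FIRST_RX_INDEX 0x00000001u
#define NUM_RX_DESCS   8192u
#define NUM_TX_DESCS   8192u
#define FIRST_TX_INDEX ((FIRST_RX_INDEX + NUM_RX_DESCS - 1u) + 1u)
#define LAST_TX_INDEX  (FIRST_TX_INDEX + NUM_TX_DESCS - 1u)

static_assert(FIRST_TX_INDEX == 8193u, "Tx range starts at index 8193");
static_assert(LAST_TX_INDEX == 16384u, "Tx range ends at index 16384");

/* True if idx names a Tx descriptor. */
bool tx_index_valid(unsigned idx)
{
    return idx >= FIRST_TX_INDEX && idx <= LAST_TX_INDEX;
}

/* The condition the kernel asserts: both ends of the link are Tx descriptors. */
bool tx_link_valid(unsigned txindx, unsigned nextindx)
{
    return tx_index_valid(txindx) && tx_index_valid(nextindx);
}
```

For the logged values, the current index 14250 passes and the next index 128 fails, since 128 falls in the Rx range; the link field of a Tx descriptor was pointing at an Rx descriptor.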

------------------------------------------------------


State when we set debug and get stats: causes the xmit channel to come alive again!


Fred’s Kernel: sendpkts -B 80 -p 10 -a 20 -c -S

sendpkts has stopped sending data…???


Fred’s Kernel: sendpkts -c -v -S -a 20 -x 8000


Flow Control Test #3

• Send data from Switch to SPC-II
  – Load kernel and process packets
  – Configure classifier and data pkts so they are dropped
    • i.e. no route for destination address.
• Overload APIC processing in Kernel
• Turn data on/off repeatedly


SPC-I System FPGA

• Supported:
  – Four interrupts supported and statically assigned:
    • PIT (IRQ 0)
    • APIC (IRQ 5)
    • COM1 (IRQ 4)
    • COM2 (IRQ 3)
  – Static fully-nested interrupt priority structure.
  – Specific End of Interrupt is the only EOI mode supported
• Not Supported:
  – Special Mask Mode
  – Automatic End of Interrupt (AUTO_EOI_1, AUTO_EOI_2)
  – Special Fully Nested Mode


SPC-II Interrupts

• Supported by a real Southbridge/ICU

• FPGA provides flow control
  – but with the traffic patterns and rates we are using there should be no flow control asserted.


Hardware Interrupt Structure (Ignoring Bus)

[Figure: CPU, ICU and APIC blocks connected by INTR, ACK and MASK/UNMASK signals.]


Overview of what happens

• APIC generates INTR to ICU
  – APIC will not generate another INTR until ACKed
• ICU pushes INTR (IRQ) onto Bus
  – ICU will only send higher priority interrupts
• CPU gets INTR
  – MASK IRQ in ICU
    • ICU will not send this IRQ again
  – ACK IRQ in ICU
    • Allows lower priority interrupts from ICU
  – Check priority and hold if lower than current
  – Call APIC intr handler
    • ACK Intr in APIC
      – APIC can generate another INTR to ICU
    • Intr processing…
      – process all packets that have been received
      – put packets being forwarded on transmit queue and resume transmit queue if needed
    • Return
  – UNMASK IRQ in ICU
    • ICU can send us this IRQ again
  – Check for other pending (held) interrupts.
  – RETI (expand…)
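The handshake in this overview can be written down as a three-flag state model, which is useful for reasoning about where an interrupt could be "lost". This is only a toy model built from the bullets on this slide, not a description of the real APIC or ICU hardware; the struct, its fields and all function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one IRQ line; every name here is hypothetical. */
struct intr_path {
    bool apic_waiting_for_ack; /* APIC raised INTR and has not been ACKed */
    bool icu_in_service;       /* ICU delivered the IRQ and has seen no EOI */
    bool icu_masked;           /* IRQ is masked in the ICU */
};

/* APIC side: returns true only if a new INTR is actually signalled. */
bool apic_raise_intr(struct intr_path *p)
{
    if (p->apic_waiting_for_ack)
        return false;          /* APIC will not generate another INTR until ACKed */
    p->apic_waiting_for_ack = true;
    return true;
}

/* ICU side: can this IRQ be pushed to the CPU right now? */
bool icu_can_deliver(const struct intr_path *p)
{
    return p->apic_waiting_for_ack && !p->icu_masked && !p->icu_in_service;
}

void icu_deliver(struct intr_path *p)
{
    assert(icu_can_deliver(p));
    p->icu_in_service = true;
}

/* CPU entry: MASK the IRQ in the ICU, then ACK (EOI) it in the ICU. */
void cpu_entry(struct intr_path *p)
{
    p->icu_masked = true;
    p->icu_in_service = false;
}

/* Handler: ACK in the APIC, so the APIC may raise INTR again. */
void handler_ack_apic(struct intr_path *p)
{
    p->apic_waiting_for_ack = false;
}

/* CPU exit: UNMASK the IRQ in the ICU. */
void cpu_exit(struct intr_path *p)
{
    p->icu_masked = false;
}
```

In this model, skipping `handler_ack_apic` leaves `apic_raise_intr` returning false forever, which matches the earlier reasoning that a truly lost interrupt should also block the ERROR interrupt that the notes say still arrives.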


sys/arch/i386/isa/vector.s

#include "opt_ddb.h"

#include <i386/isa/icu.h>
#include <dev/isa/isareg.h>

#define ICU_HARDWARE_MASK

#define IRQ_BIT(irq_num)        (1 << ((irq_num) % 8))
#define IRQ_BYTE(irq_num)       ((irq_num) / 8)

#ifdef ICU_SPECIAL_MASK_MODE    // SPC System FPGA does not support SMM
#define ACK1(irq_num)
#define ACK2(irq_num) \
        movb    $(0x60|IRQ_SLAVE),%al   /* specific EOI for IRQ2 */ ;\
        outb    %al,$IO_ICU1
#define MASK(irq_num, icu)
#define UNMASK(irq_num, icu) \
        movb    $(0x60|(irq_num%8)),%al /* specific EOI */ ;\
        outb    %al,$icu


sys/arch/i386/isa/vector.s

#else /* I.E. NOT ICU_SPECIAL_MASK_MODE */

#ifndef AUTO_EOI_1
#define ACK1(irq_num) \
        movb    $(0x60|(irq_num%8)),%al /* specific EOI */ ;\
        outb    %al,$IO_ICU1
#else
#define ACK1(irq_num)
#endif

#ifndef AUTO_EOI_2
#define ACK2(irq_num) \
        movb    $(0x60|(irq_num%8)),%al /* specific EOI */ ;\
        outb    %al,$IO_ICU2            /* do the second ICU first */ ;\
        movb    $(0x60|IRQ_SLAVE),%al   /* specific EOI for IRQ2 */ ;\
        outb    %al,$IO_ICU1
#else
#define ACK2(irq_num)
#endif


sys/arch/i386/isa/vector.s

#ifdef ICU_HARDWARE_MASK

#define MASK(irq_num, icu) \
        movb    _C_LABEL(imen) + IRQ_BYTE(irq_num),%al  /* imen: interrupt mask enable (2 bytes) */ ;\
        orb     $IRQ_BIT(irq_num),%al                   /* mask our irq (put a 1 in its place) */ ;\
        movb    %al,_C_LABEL(imen) + IRQ_BYTE(irq_num)  ;\
        FASTER_NOP                                      ;\
        outb    %al,$(icu+1)                            /* write it to the ICU */

#define UNMASK(irq_num, icu) \
        cli                                             ;\
        movb    _C_LABEL(imen) + IRQ_BYTE(irq_num),%al  ;\
        andb    $~IRQ_BIT(irq_num),%al                  ;\
        movb    %al,_C_LABEL(imen) + IRQ_BYTE(irq_num)  ;\
        FASTER_NOP                                      ;\
        outb    %al,$(icu+1)                            ;\
        sti

#else /* ICU_HARDWARE_MASK */

#define MASK(irq_num, icu)
#define UNMASK(irq_num, icu)

#endif /* ICU_HARDWARE_MASK */

#endif /* ICU_SPECIAL_MASK_MODE */
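The bit arithmetic inside the ACK, MASK and UNMASK macros can be checked in plain C. The sketch below copies `IRQ_BIT` and `IRQ_BYTE` from the listing and models the read-modify-write on the two-byte `imen` shadow; `SPECIFIC_EOI`, `imen_mask` and `imen_unmask` are hypothetical names, and the actual port write (`outb` to icu+1) is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define IRQ_BIT(irq_num)      (1 << ((irq_num) % 8))
#define IRQ_BYTE(irq_num)     ((irq_num) / 8)
/* Hypothetical name for the byte the ACK macros load into %al. */
#define SPECIFIC_EOI(irq_num) (0x60 | ((irq_num) % 8))

/* The APIC is on IRQ 5: byte 0 of imen, bit 0x20. */
static_assert(IRQ_BYTE(5) == 0 && IRQ_BIT(5) == 0x20, "IRQ 5 is bit 5 of ICU1");

/* Model of MASK: set the IRQ's bit in the shadow and return the byte
 * that the macro would write to the ICU mask register. */
uint8_t imen_mask(uint8_t imen[2], int irq)
{
    assert(irq >= 0 && irq < 16);
    imen[IRQ_BYTE(irq)] = (uint8_t)(imen[IRQ_BYTE(irq)] | IRQ_BIT(irq));
    return imen[IRQ_BYTE(irq)];
}

/* Model of UNMASK: clear the IRQ's bit and return the byte to write. */
uint8_t imen_unmask(uint8_t imen[2], int irq)
{
    assert(irq >= 0 && irq < 16);
    imen[IRQ_BYTE(irq)] = (uint8_t)(imen[IRQ_BYTE(irq)] & ~IRQ_BIT(irq));
    return imen[IRQ_BYTE(irq)];
}
```

For the APIC interrupt this gives a specific-EOI byte of 0x65 and a mask bit of 0x20 in the first ICU, which are the values to look for in a bus trace of ports IO_ICU1 and IO_ICU1+1.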


sys/arch/i386/isa/vector.s

#ifdef __ELF__

#define XINTR(irq_num) Xintr/**/irq_num

#define XHOLD(irq_num) Xhold/**/irq_num

#define XSTRAY(irq_num) Xstray/**/irq_num

#else

#define XINTR(irq_num) _Xintr/**/irq_num

#define XHOLD(irq_num) _Xhold/**/irq_num

#define XSTRAY(irq_num) _Xstray/**/irq_num

#endif


sys/arch/i386/isa/vector.s

/* Beginning of INTR Macro */

#define INTR(irq_num, icu, ack)

IDTVEC(resume/**/irq_num)

cli

jmp 1f

IDTVEC(recurse/**/irq_num)

pushfl

pushl %cs

pushl %esi

cli

Block the CPU from accepting any more interrupts.


sys/arch/i386/isa/vector.s

XINTR(irq_num):

pushl $0 /* dummy error code */

pushl $T_ASTFLT /* trap # for doing ASTs */

INTRENTRY

MAKE_FRAME

MASK(irq_num, icu) /* mask it in hardware */

ack(irq_num) /* and allow other intrs */

incl MY_COUNT+V_INTR /* statistical info */

ICU will not send us any more of this IRQ.

ACK this IRQ to the ICU. Allows it to generate other interrupts. Without this the ICU would only generate higher priority interrupts.

When an interrupt occurs the CPU will clear the interrupt enable bit (equivalent of cli). An iret restores the bit.


sys/arch/i386/isa/vector.s

testb $IRQ_BIT(irq_num),_C_LABEL(cpl) + IRQ_BYTE(irq_num)

jnz XHOLD(irq_num) /* currently masked; hold it */

1: movl _C_LABEL(cpl),%eax /* cpl to restore on exit */

pushl %eax

orl _C_LABEL(intrmask) + (irq_num) * 4,%eax

movl %eax,_C_LABEL(cpl) /* add in this intr's mask */

sti /* safe to take intrs now */

In-kernel interrupt mask

Allow CPU to accept more interrupts.

Pre-computed masks for each IRQ:
  IRQ 0: 0xe0000021
  IRQ 3: 0xe0000039
  IRQ 4: 0xe0000039
  IRQ 5: 0xc0000020

[Diagram: mask bit positions labelled by IRQ number, 5 down to 0.]

Add IRQ bit to ipending


sys/arch/i386/isa/vector.s

movl _C_LABEL(intrhand) + (irq_num) * 4,%ebx /* head of chain */

testl %ebx,%ebx

jz XSTRAY(irq_num) /* no handlers; we're stray */

STRAY_INITIALIZE /* nobody claimed it yet */

incl _C_LABEL(intrcnt) + (4*(irq_num)) /* XXX */


sys/arch/i386/isa/vector.s

7: movl IH_ARG(%ebx),%eax /* get handler arg */

testl %eax,%eax

jnz 4f

movl %esp,%eax /* 0 means frame pointer */

4: pushl %eax

call IH_FUN(%ebx) /* call it */

addl $4,%esp /* toss the arg */

STRAY_INTEGRATE /* maybe he claimed it */

incl IH_COUNT(%ebx) /* count the intrs */

movl IH_NEXT(%ebx),%ebx /* next handler in chain */

testl %ebx,%ebx

jnz 7b

STRAY_TEST /* see if it's a stray */

5: UNMASK(irq_num, icu) /* unmask it in hardware */

jmp _C_LABEL(Xdoreti) /* lower spl and do ASTs */

Call NetBSD Interrupt Handler

ICU is now able to send us another interrupt for this IRQ

Locate a handler for this IRQ

Return from Interrupt:
  Resume other interrupts
  Check for pending interrupts
  Restore stack
  iret


sys/arch/i386/isa/vector.s

IDTVEC(stray/**/irq_num)

pushl $irq_num

call _C_LABEL(isa_strayintr)

addl $4,%esp

incl _C_LABEL(strayintrcnt) + (4*(irq_num))

jmp 5b

IDTVEC(hold/**/irq_num) // XHOLD()

orb $IRQ_BIT(irq_num),_C_LABEL(ipending) + IRQ_BYTE(irq_num)

INTRFASTEXIT

/* End of INTR Macro */


sys/arch/i386/isa/vector.s

INTR(0, IO_ICU1, ACK1) /* Clock interrupt */

INTR(1, IO_ICU1, ACK1)

INTR(2, IO_ICU1, ACK1)

INTR(3, IO_ICU1, ACK1) /* COM 2 Interrupt */

INTR(4, IO_ICU1, ACK1) /* Com 1 Interrupt */

INTR(5, IO_ICU1, ACK1) /* APIC Interrupt */

INTR(6, IO_ICU1, ACK1)

INTR(7, IO_ICU1, ACK1)

INTR(8, IO_ICU2, ACK2)

INTR(9, IO_ICU2, ACK2)

INTR(10, IO_ICU2, ACK2)

INTR(11, IO_ICU2, ACK2)

INTR(12, IO_ICU2, ACK2)

INTR(13, IO_ICU2, ACK2)

INTR(14, IO_ICU2, ACK2)

INTR(15, IO_ICU2, ACK2)


sys/arch/i386/isa/vector.s

/*
 * Add a mask to cpl, and return the old value of cpl.
 */
static __inline int
splraise(ncpl)
        register int ncpl;
{
        register int ocpl = cpl;

        cpl = ocpl | ncpl;
        return (ocpl);
}

/*
 * Restore a value to cpl (unmasking interrupts).
 * If any unmasked interrupts are pending,
 * call Xspllower() to process them.
 */
static __inline void
splx(ncpl)
        register int ncpl;
{
        cpl = ncpl;
        if (ipending & ~ncpl)
                Xspllower();
}

/*
 * Same as splx(), but we return the old value of spl, for the
 * benefit of some splsoftclock() callers.
 */
static __inline int
spllower(ncpl)
        register int ncpl;
{
        register int ocpl = cpl;

        cpl = ncpl;
        if (ipending & ~ncpl)
                Xspllower();
        return (ocpl);
}

Call Xspllower if there is something pending that is higher priority than our new cpl


sys/arch/i386/isa/icu.s: spllower()

IDTVEC(spllower) // Xspllower()

pushl %ebx

pushl %esi

pushl %edi

movl _C_LABEL(cpl),%ebx # save priority

movl $1f,%esi # address to resume loop at

1: movl %ebx,%eax

notl %eax

andl _C_LABEL(ipending),%eax

jz 2f

bsfl %eax,%eax

btrl %eax,_C_LABEL(ipending)

jnc 1b

jmp *_C_LABEL(Xrecurse)(,%eax,4)

2: popl %edi

popl %esi

popl %ebx

ret

Is there a pending interrupt that is high enough priority?

If yes, then restart it?
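The loop in Xspllower reads more easily in C: mask `ipending` with the complement of `cpl`, take the lowest set bit (`bsfl`), clear it (`btrl`) and dispatch, repeating until nothing runnable is left. The sketch below is a simplified model under stated assumptions: it records the dispatch order in an array instead of jumping through the `Xrecurse` table, it does not raise `cpl` during dispatch the way the real recursed handlers do, and `run_pending` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Process every pending interrupt not blocked by cpl, lowest bit first.
 * Dispatched bit numbers are written to order[] (up to max_order entries).
 * Returns the number of interrupts dispatched.
 */
int run_pending(uint32_t cpl, uint32_t *ipending, int order[], int max_order)
{
    int n = 0;

    assert(ipending != NULL);
    for (;;) {
        uint32_t runnable = *ipending & ~cpl;   /* notl + andl */
        if (runnable == 0)
            break;                              /* jz 2f */

        int bit = 0;                            /* bsfl: lowest set bit */
        while (((runnable >> bit) & 1u) == 0u)
            bit++;

        *ipending &= ~(UINT32_C(1) << bit);     /* btrl: clear it in ipending */
        if (order != NULL && n < max_order)
            order[n] = bit;                     /* stands in for jmp *Xrecurse */
        n++;
    }
    return n;
}
```

A held APIC interrupt (bit 5) therefore stays in `ipending` for as long as `cpl` has bit 5 set, and is only restarted when some later `splx` or interrupt return lowers `cpl` and reaches this loop.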

