Post on 06-Mar-2018
transcript
Troubleshooting Cisco Catalyst 6500 / 6800 Series Switches
BRKCRS-3143
Yogesh Ramdoss, Technical Leader, Cisco Services
yramdoss@cisco.com
© 2014 Cisco and/or its affiliates. All rights reserved. BRKCRS-3143 Cisco Public
Agenda
• Architecture: Sup720 vs. Sup2T
• Troubleshooting Unicast Forwarding
• Troubleshooting Multicast Forwarding
• Troubleshooting High CPU Utilization
Goal of this Session
Teach commonly used techniques and commands to troubleshoot Cisco Catalyst 6500/6800 switches and make them less of a BLACK BOX!
We can troubleshoot it !!
Why does this session cover both the Catalyst 6500 and 6800?
• The Catalyst 6800 is the foundation of the Instant Access (IA) solution.
• Catalyst 6800 switches and modules are built on the Sup2T/PFC4 architecture.
Take Away: Whatever we learn in this session applies to the Catalyst 6500 Sup2T (standalone and VSS) and the Catalyst 6800 Instant Access solution.
Catalyst 6500 Series
Catalyst 6800 Series
Recommended Session: BRKCRS-3148 - Advanced
Cisco Catalyst 6500 / 6800 Series Troubleshooting
Architecture: Sup720 vs. Sup2T
Acronyms Legend PFC: Policy Feature Card
DFC: Distributed Forwarding Card
FE: Forwarding Engine
CAM: Content Addressable Memory
TCAM: Ternary CAM
FIB: Forwarding Information Base
ACL: Access Control List
ACE: Access Control Entry
EOBC: Ethernet Out-of-Band Channel
BD: Bridge Domain
LIF: Logical Interface
CoPP: Control Plane Policing
FPOE: Fabric Port of Exit
Reference slide
Supervisor 720/PFC3 Architecture
[Architecture diagram] The MSFC3 complex has two CPUs, each with its own Flash and DRAM: a 1 Gbps RP CPU running the Layer 3 control plane (e.g., OSPF, BGP, SNMP) and a 1 Gbps SP CPU running the Layer 2 control plane (e.g., LACP, BPDUs) and hardware programming. The PFC3 holds the L2 engine (64K L2 CAM) and the L3/4 engine (FIB TCAM, Adjacency TCAM, ACL TCAM, QoS TCAM, ACE counters, NetFlow table). The data path consists of the 16 Gbps shared bus (DBUS/RBUS), a 20 Gbps fabric interface and replication engine (e.g., multicast, SPAN) with MET, port ASICs with SFP/GETX uplinks, and an integrated switch fabric with 18 x 20G traces. The EOBC connects the supervisor to the line cards.
Supervisor 2T/PFC4 Architecture
[Architecture diagram] The MSFC5 complex contains a single dual-core CPU (2 Gbps connection) for both Layer 2 and Layer 3 control-plane protocols and hardware programming, plus a Central Management Processor, Flash, and DRAM. The PFC4 holds the L2 engine (128K L2 CAM, ACE counters) and the L3/4 engine (FIB TCAM, ADJ TCAM, CL1 and CL2 TCAMs, NetFlow, LIF table, LIF stats, LIF map, RPF table) with MET. The data path consists of the DBUS/RBUS, 1GE/10GE uplinks, a 40 Gbps fabric interface and replication engine, port ASICs, and an integrated switch fabric with 26 x 40G traces. The EOBC connects the supervisor to the line cards.
Supervisor Engine Sup2T
• Sup2T Architecture – White Paper: http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/white_paper_c11-676346.html
• Sup2T FAQs: http://www.cisco.com/en/US/prod/collateral/modules/ps2797/ps11878/qa_c67-648478.html
• Catalyst 6500 Ethernet Modules Data Sheet: http://www.cisco.com/en/US/products/hw/switches/ps708/products_data_sheets_list.html#anchor3
• Cisco Catalyst 6500 Sup720 to Sup2T engines: http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/guide_c07-717261.html
• Cisco Catalyst 6800 Data Sheets and Literature: http://www.cisco.com/c/en/us/products/switches/catalyst-6800-series-switches/literature.html
• Cisco Catalyst 6800 Series Switches – Support Page: http://www.cisco.com/c/en/us/support/switches/catalyst-6800-series-switches/tsd-products-support-series-home.html
Reference Materials
Troubleshooting Unicast Forwarding
Troubleshooting Unicast Forwarding
• L2 Topology and Packet Flow
• L2 Packet Flow Troubleshooting L2 CAM, Interface counters/errors, Switch Fabric
• L3 Topology and Packet Flow
• L3 Packet Flow Troubleshooting FIB and Adjacency TCAM
Agenda
Troubleshooting Unicast Forwarding
• L2 Topology and Packet Flow
• L2 Packet Flow Troubleshooting L2 CAM, Interface counters/errors, Switch Fabric
• L3 Topology and Packet Flow
• L3 Packet Flow Troubleshooting FIB and Adjacency TCAM
Agenda
L2 Topology
• DUT is the Device Under Test we are troubleshooting
• DUT is a 6509-E with Supervisor 2T
• Four-link TenGigabitEthernet L2 EtherChannel Po11 between R1 and the DUT (port pairs Ten1/4↔Ten1/1, Ten1/2↔Ten1/8, Ten2/1↔Ten1/5, Ten2/2↔Ten1/7)
• Four-link TenGigabitEthernet L2 EtherChannel Po12 between the DUT and R2 (port pairs Ten1/5↔Ten1/3, Ten1/4↔Ten1/6, Ten1/7↔Ten2/5, Ten1/8↔Ten2/6)
L2 Unicast Traffic
Sup2T# show ip arp
Protocol Address Age (min) Hardware Addr Type Interface
Internet 192.168.10.1 - b414.8961.3780 ARPA Vlan10
Internet 192.168.10.2 31 0006.5bbc.81a2 ARPA Vlan10
Internet 192.168.10.3 32 0006.5bbc.7acb ARPA Vlan10
Traffic Configuration
Topology: R1 connects to the DUT via Po11 (Ten1/4↔Ten1/1, Ten1/2↔Ten1/8, Ten2/1↔Ten1/5, Ten2/2↔Ten1/7) and the DUT to R2 via Po12 (Ten1/5↔Ten1/3, Ten1/4↔Ten1/6, Ten1/7↔Ten2/5, Ten1/8↔Ten2/6). Host 1 (192.168.0.2) and Host 2 (192.168.0.3) are in VLAN 10.
Both MAC addresses are learned on port-channels; which physical link in the channel is actually receiving the packets?
Sup2T# show mac address-table address 0006.5bbc.81a2
Legend: * - primary entry
age - seconds since last seen
n/a - not available
S - secure entry
R - router's gateway mac address entry
D - Duplicate mac address entry
Displaying entries from DFC linecard [1]:
vlan mac address type learn age ports
----+----+---------------+-------+-----+----------+--------
* 10 0006.5bbc.81a2 dynamic Yes 5 Po11
Displaying entries from DFC linecard [2]:
vlan mac address type learn age ports
----+----+---------------+-------+-----+----------+--------
* 10 0006.5bbc.81a2 dynamic Yes 90 Po11
L2 Unicast Traffic Where are the MAC Addresses Learned?
Host 1
Host 2
Sup2T# show mac address-table address 0006.5bbc.7acb
Displaying entries from DFC linecard [1]:
vlan mac address type learn age ports
----+----+---------------+-------+-----+----------+--------
* 10 0006.5bbc.7acb dynamic Yes 0 Po12
Displaying entries from DFC linecard [2]:
vlan mac address type learn age ports
----+----+---------------+-------+-----+----------+--------
* 10 0006.5bbc.7acb dynamic Yes 110 Po12
L2 Unicast Traffic Which Link in the EtherChannel Is Being Used?
Topology: Host 1 (192.168.0.2) is attached to R1 on Gig4/1; Host 2 (192.168.0.3) sits behind R2; R1 connects to the DUT via Po11 and the DUT to R2 via Po12, with the same port pairs as before.
R1#show etherchannel load-balance module 4
EtherChannel Load-Balancing Configuration:
src-dst-ip vlan included
mpls label-ip
EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Source XOR Destination MAC address
IPv4: Source XOR Destination IP address
IPv6: Source XOR Destination IP address
MPLS: Label or IP
R1# show etherchannel load-balance interface po11 ip 192.168.0.2 192.168.0.3
Computed RBH: 0x3
Would select Te1/8 of Po11
The mode is "src-dst-ip", so use only the source and destination IP as arguments. Prior to 12.2(33)SXH, use test etherchannel load-balance … (same arguments) on the SP for Sup720 engines.
The link selected is Ten1/8 in Po11 of R1 for traffic to 192.168.0.3.
Check the load-balancing configuration.
Use the ingress module number in the command in case per-module load balancing is configured (SXH images and later).
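The src-dst-ip hashing above can be illustrated with a short sketch. This is not the actual platform hash polynomial; the member list, the XOR-and-fold reduction, and the bucket-to-link mapping are simplifying assumptions, shown only to make the RBH (Result Bundle Hash) idea concrete:

```python
# Illustrative sketch of a src-dst-ip EtherChannel hash: XOR the two
# IPv4 addresses, fold the result down to a 3-bit RBH (8 buckets),
# then let the RBH pick one member link of the bundle.
import ipaddress

def rbh_src_dst_ip(src: str, dst: str) -> int:
    """XOR source and destination IPv4 addresses, fold to 3 bits."""
    x = int(ipaddress.IPv4Address(src)) ^ int(ipaddress.IPv4Address(dst))
    x ^= x >> 16   # fold 32 bits -> 16
    x ^= x >> 8    # fold 16 bits -> 8
    x ^= x >> 4    # fold 8 bits -> 4
    return x & 0x7

def select_link(members: list, rbh: int) -> str:
    """With 4 members, each link owns 2 of the 8 RBH buckets."""
    return members[rbh % len(members)]

# Hypothetical member ordering of a four-link Po11 bundle.
members = ["Te1/1", "Te1/8", "Te1/5", "Te1/7"]
rbh = rbh_src_dst_ip("192.168.0.2", "192.168.0.3")
print(rbh, select_link(members, rbh))
```

The real switch computes the RBH in hardware with its own polynomial; "show etherchannel load-balance interface …" asks the box for the authoritative answer, which is why it is the tool used in this session.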
L2 Unicast Traffic Network Path Verification: Result
Topology (shown once per traffic direction): Host 1 (192.168.0.2) on R1 Gig4/1 and Host 2 (192.168.0.3) behind R2, connected through Po11 (R1 to DUT) and Po12 (DUT to R2).
Each direction can use different links in the bundles !
Troubleshooting Unicast Forwarding
• L2 Topology and Packet Flow
• L2 Packet Flow Troubleshooting L2 CAM, Interface counters/errors, Switch Fabric
• L3 Topology and Packet Flow
• L3 Packet Flow Troubleshooting FIB and Adjacency TCAM
Agenda
Layer 2 Learning and Forwarding
• In Sup720 engines, Layer 2 forwarding is based on {VLAN, MAC} pairs. In Sup2T engines, the lookup is based on {BD, MAC}: the VLAN maps to a Bridge Domain (BD) for bridging and to a Logical Interface (LIF) for routing.
• MAC learning is done per PFC or DFC; each PFC/DFC maintains a separate L2 CAM table.
• PFCs and DFCs age entries independently
– Refreshing of entries is based on "seeing" traffic from the specific host
– New learns on one forwarding engine are communicated to the other engines via the MAC-Sync process (which occurs over the EOBC)
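The bullets above can be modeled with a toy sketch. Entry format, timer values, and function names are all hypothetical; the point is only to show why per-engine tables plus MAC sync prevent flooding:

```python
# Toy model of per-forwarding-engine L2 CAM tables with independent
# aging and an out-of-band MAC-sync step between them.
class L2Cam:
    """One forwarding engine's L2 CAM: {(vlan, mac): entry}."""
    def __init__(self, name, aging_s=300):
        self.name, self.aging_s, self.table = name, aging_s, {}

    def learn(self, vlan, mac, port, now, primary=True):
        self.table[(vlan, mac)] = {"port": port, "seen": now, "primary": primary}

    def age(self, now):
        # Each PFC/DFC ages independently, based on traffic *it* has seen.
        self.table = {k: v for k, v in self.table.items()
                      if now - v["seen"] < self.aging_s}

def mac_sync(learner, others, vlan, mac, now):
    """New learns are pushed to the other FEs over the EOBC; the
    learner keeps the primary (*) entry, peers hold synced copies."""
    entry = learner.table[(vlan, mac)]
    for fe in others:
        fe.learn(vlan, mac, entry["port"], now, primary=False)

dfc1, dfc2 = L2Cam("module 1"), L2Cam("module 2")
dfc1.learn(10, "0006.5bbc.7acb", "Po12", now=0)
mac_sync(dfc1, [dfc2], 10, "0006.5bbc.7acb", now=0)
# Without the sync step, dfc2 would flood frames for this MAC in VLAN 10.
```

In this model, as on the real box, an entry synced to a peer engine still ages out on that peer unless refreshed, which is why the aging timer must be coordinated with the sync interval (next slide).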
Detailed L2 Packet Flow Troubleshooting Are We Learning MAC Addresses?
Sup2T# show mac address-table address 0006.5bbc.7acb vlan 10 [all]
Legend: * - primary entry
age - seconds since last seen
n/a - not available
S - secure entry
R - router's gateway mac address entry
D - Duplicate mac address entry
Displaying entries from DFC linecard [1]:
vlan mac address type learn age ports
----+----+---------------+-------+-----+----------+---------
* 10 0006.5bbc.7acb dynamic Yes 0 Po12
Displaying entries from DFC linecard [2]:
vlan mac address type learn age ports
----+----+---------------+-------+-----+----------+----------
* 10 0006.5bbc.7acb dynamic Yes 185 Po12
Displaying entries from active supervisor:
vlan mac address type learn age ports
----+----+---------------+-------+-----+----------+----------
10 0006.5bbc.7acb dynamic Yes 205 Po12
By default, Sup2T prints entries from all DFCs. On Sup720, use the all keyword to see the entry from all DFCs in the system.
* denotes the primary forwarding entry; it is owned by the ingress forwarding engine for frames sourced from that Ethernet address.
Flooding can occur if a MAC is not known by ALL FEs in the system.
Detailed L2 Packet Flow Troubleshooting Verify L2 Tables: MAC Sync Feature
Sup2T# show mac address-table synchronize statistics
MAC Entry Out-of-band Synchronization Feature Statistics:
---------------------------------------------------------
Module [1]
-----------
Module Status:
Statistics collected from module : 1
Global Status:
Status of feature enabled on the switch : on
Default activity time : 160
Configured current activity time : 160
Statistics from ASIC 0 when last activity timer expired:
Age value in seconds from age byte register : 0x4C
<snip>
Number of entries created new : 377
Number of entries create failed : 0
Module [2]
-----------
Module Status:
Statistics collected from module : 2
Global Status:
Status of feature enabled on the switch : on
Off by default on Sup720; when a WS-X6708 is present, it is on by default and the MAC aging timer is set to 480 sec. Why 480?
The default activity interval is 160 seconds, and the normal aging timer should be at least 3x the activity interval … so with the default of 160 sec, set the MAC aging timer to 480 sec or more.
Number of entries that were synced by the SW sync feature.
The Out-of-Band (OOB) MAC-Sync feature is enabled by default on Sup2T and disabled by default on Sup720. Flooding can occur when L2 CAM tables are not in sync. Enable this feature on Sup720 with the "mac-address-table synchronize" command (under "config t").
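The "why 480?" arithmetic above is simply the 3x rule applied to the default activity interval; a tiny helper (hypothetical name) makes it explicit:

```python
def recommended_mac_aging(activity_interval_s: int = 160) -> int:
    """MAC aging should be at least 3x the MAC-sync activity interval,
    so a synced entry survives a couple of missed refresh cycles
    before it ages out on a peer forwarding engine."""
    return 3 * activity_interval_s

print(recommended_mac_aging())  # 3 * 160 = 480 seconds
```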
Detailed L2 Packet Flow Troubleshooting
[Diagram] Traffic between Host1 (Ten1/1 and Ten1/2 on WS-X6908 module 1) and Host2 (Ten2/5 and Ten2/6 on WS-X6908 module 2) crosses the switch fabric; each module has a DFC4 (Layer 2 and Layer 3/4 engines), port ASICs, and a fabric interface and replication engine with MET. Along this path:
• Look at the interface counters and errors for the ingress and egress interfaces.
• Check the L2 forwarding engine counters.
• Verify the fabric channels used in the flow.
Detailed L2 Packet Flow Troubleshooting Verify L2 Counters: Interface Counters
Sup2T# show interface ten 1/2 counters
Port InOctets InUcastPkts InMcastPkts InBcastPkts
Te1/2 249784 2000 8 40
Port OutOctets OutUcastPkts OutMcastPkts OutBcastPkts
Te1/2 83246 18 6 0
Sup2T#show interface ten 1/1 counters
Port InOctets InUcastPkts InMcastPkts InBcastPkts
Te1/1 10590 18 28 0
Port OutOctets OutUcastPkts OutMcastPkts OutBcastPkts
Te1/1 246412 2008 10 0
Sup2T#show interface ten 2/5 counters
Port InOctets InUcastPkts InMcastPkts InBcastPkts
Te2/5 2890 2890 0 0
Port OutOctets OutUcastPkts OutMcastPkts OutBcastPkts
Te2/5 273441 2032 11 0
And, similarly on Ten2/5
We did a ping (2000 packets, 100 bytes per packet) from 192.168.0.2 to 192.168.0.3.
Verify that the interface counters relevant to the path incremented sufficiently!
Shows interface-level packet counts and errors since the last time "clear counters" was issued.
Hardware counters; not cleared by the "clear counters" command.
Detailed L2 Packet Flow Troubleshooting Verify L2 Counters: Interface Counters
Sup2T# show interface ten1/1 counter error
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Te1/1 0 0 0 0 0 0
Port Single-Col Multi-Col Late-Col Excess-Col Carri-Sen Runts Giants
Te1/1 0 0 0 0 0 0 0
Port SQETest-Err Deferred-Tx IntMacTx-Err IntMacRx-Err Symbol-Err
Te1/1 0 0 0 0 0
Sup2T# clear counters
Sup2T# show counter interface te1/1 delta
Time since last clear
---------------------
00:00:02
64 bit counters:
0. rxHCTotalPkts = 1
1. txHCTotalPkts = 3
2. rxHCUnicastPkts = 0
3. txHCUnicastPkts = 0
<snip>
Sup2T# show counter interface te1/1
<snip>
64 bit counters:
0. rxHCTotalPkts = 13021673
1. txHCTotalPkts = 3090200
2. rxHCUnicastPkts = 2684645
3. txHCUnicastPkts = 2684649
<snip>
.... nearly 140 counters ...
shows traffic statistics since last clear
Detailed L2 Packet Flow Troubleshooting Verify L2 Counters: L2 Forwarding Engine VLAN Count
Sup2T #show vlan id 10 counters
* L2 counters include multicast and broadcast packets
Vlan Id : 10
L2 Unicast Packets : 4012
L2 Unicast Octets : 401868
L3 Input Unicast Packets : 0
L3 Input Unicast Octets : 0
L3 Output Unicast Packets : 0
L3 Output Unicast Octets : 0
L3 Output Multicast Packets : 0
L3 Output Multicast Octets : 0
L3 Input Multicast Packets : 0
L3 Input Multicast Octets : 0
L2 Multicast Packets : 0
L2 Multicast Octets : 0
The VLAN is bidirectional, so this counts both directions of the flow (192.168.0.2 ↔ 192.168.0.3).
Sup2T# show fabric fpoe map
slot channel logical fpoe physical fpoe
1 0 0 5
1 1 1 5
1 2 32 6
1 3 33 6
2 0 2 11
2 1 3 11
2 2 34 7
2 3 35 7
<snip>
For each ingress and egress interface, find the mapping between the interface and the Fabric Port Of Exit (FPOE).
Detailed L2 Packet Flow Troubleshooting Identifying the Fabric Channels
Sup2T# sh fabric fpoe interface ten1/1
fpoe for TenGigabitEthernet1/1 is 1
Sup2T# sh fabric fpoe interface ten1/2
fpoe for TenGigabitEthernet1/2 is 1
Sup2T# sh fabric fpoe interface ten2/5
fpoe for TenGigabitEthernet2/5 is 34
Sup2T# sh fabric fpoe interface ten2/6
fpoe for TenGigabitEthernet2/6 is 34
Find the mapping between FPOE and slot/channel (requires "service internal" under "config t").
[Diagram] The same two-module WS-X6908 path as before, now annotated with the fabric mapping: Ten1/1 and Ten1/2 on module 1 use FPOE 1, while Ten2/5 and Ten2/6 on module 2 use FPOE 34; each module's fabric interface ASIC connects to the fabric ASIC.
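Joining the two outputs above is a simple two-step lookup. A sketch, with the tables transcribed from the example output (the dictionary and function names are ours):

```python
# Logical FPOE -> (slot, channel), transcribed from "show fabric fpoe map".
fpoe_map = {
    0: (1, 0), 1: (1, 1), 32: (1, 2), 33: (1, 3),
    2: (2, 0), 3: (2, 1), 34: (2, 2), 35: (2, 3),
}

# Interface -> logical FPOE, from "show fabric fpoe interface ...".
if_fpoe = {"Te1/1": 1, "Te1/2": 1, "Te2/5": 34, "Te2/6": 34}

def fabric_channel(intf: str):
    """Resolve an interface to the (slot, channel) it uses on the fabric,
    so the right rows of "show fabric utilization" can be checked."""
    return fpoe_map[if_fpoe[intf]]

print(fabric_channel("Te2/5"))
```

Once the (slot, channel) pair is known, the fabric utilization, status, and error counters on the following slides can be narrowed to exactly the channels carrying the flow.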
Detailed L2 Packet Flow Troubleshooting Verify L2 Counters: Switching Fabric Utilization
Sup2T# show fabric utilization detail
Fabric utilization: Ingress Egress
Module Chanl Speed rate peak rate peak
1 0 40G 0% 0% 0% 0%
1 1 40G 0% 1% @15:47 21Feb12 0% 0%
2 0 40G 0% 0% 0% 0%
2 1 40G 0% 0% 0% 1% @02:34 22Feb12
5 0 20G 0% 0% 0% 0%
5 1 20G 0% 0% 0% 0%
<snip>
Sup2T# show fabric status
slot channel speed module fabric hotStandby Standby Standby
status status support module fabric
1 0 40G OK OK Y(not-hot)
1 1 40G OK OK Y(not-hot)
2 0 40G OK OK Y(not-hot)
2 1 40G OK OK Y(not-hot)
5 0 20G OK OK N/A
5 1 20G OK OK N/A
6 0 20G OK OK N/A
6 1 20G OK OK N/A
Check the utilization (current and last peak value) for the relevant fabric channels … did any peak coincide with the moment of the drops?
Check that the status of the fabric channels is OK. An example for a misbehaving module or fabric channel: module status reported as "DDR Sync".
Detailed L2 Packet Flow Troubleshooting Verify L2 Counters: Relevant Fabric Channel Counters
Sup2T# show fabric channel-counters 1
slot channel rxErrors txErrors txDrops lbusDrops
1 0 0 0 0 0
1 1 0 0 0 0
Sup2T# show fabric channel-counters 2
slot channel rxErrors txErrors txDrops lbusDrops
2 0 0 0 0 0
2 1 0 0 0 0
Sup2T# show fabric errors 1
Module errors:
slot channel crc hbeat sync DDR sync
1 0 0 0 0 0
1 1 0 0 0 0
Fabric errors:
slot channel sync buffer timeout
1 0 0 0 0
1 1 0 0 0
• Fabric ASIC unable to send traffic to the fabric-enabled module for the last 3+ seconds.
• Fabric serial link bit errors (8 serial links in each fabric channel), reported as soon as 2 fabric serial link interrupts occur within 100 ms; can result in rxErrors/txErrors. Is the card inserted OK?
• Line card fabric ASIC reports bad packets: is the card inserted properly? A few incrementing 'rxErrors' not correlated to any network events are OK and acceptable.
• Unable to send packets from the fabric to the line card: check traffic levels; is the line card OK?
• lbusDrops: fabric interface unable to send packets from the local bus to the fabric (Supervisor and 65XX modules only; 67XX and above report Overruns in "show interface" output). Check traffic levels; any signs of congestion?
Troubleshooting Unicast Forwarding
• L2 Topology and Packet Flow
• L2 Packet Flow Troubleshooting L2 CAM, Interface counters/errors, Switch Fabric
• L3 Topology and Packet Flow
• L3 Packet Flow Troubleshooting FIB and Adjacency TCAM
Agenda
L3 Unicast Traffic Network Configuration
• DUT is the Device Under Test we are troubleshooting
• DUT is a 6509-E with Supervisor 2T
• Four-link TenGigabitEthernet L2 EtherChannel trunk Po11 between R1 and the DUT (port pairs Ten1/4↔Ten1/1, Ten1/2↔Ten1/8, Ten2/1↔Ten1/5, Ten2/2↔Ten1/7)
VLANs 10, 20, 30 and 40 are assigned the 192.168.10.0/24, 192.168.20.0/24, 192.168.30.0/24 and 192.168.40.0/24 subnets respectively.
• Four L3 links between the DUT and R2 (port pairs Ten1/5↔Ten1/3, Ten1/4↔Ten1/6, Ten1/7↔Ten2/5, Ten1/8↔Ten2/6)
The four links are assigned the 172.16.10.0/24, 172.16.20.0/24, 172.16.30.0/24 and 172.16.40.0/24 subnets respectively.
Host 1 (100.100.100.1) is behind R1; Host 2 (200.200.200.1) is behind R2.
L3 Unicast Traffic Different Switching Paths for L3 Traffic in Catalyst 6500/6800
Process Switching Path
Software-based CEF Switching Path
Hardware-based CEF switching Path
[Diagram] Host1 and Host2 with the DUT in between. This slide is just a logical representation of the different switching paths (also known as Switching Vectors) in the Catalyst 6500/6800.
L3 Unicast Traffic Host 1 → Host 2: Which L3 Next Hop / L2 Link from R1?
SW R1# show ip route 200.200.200.1
Routing entry for 200.200.200.1/32
Known via "ospf 100", distance 110, metric 3, type intra area
Last update from 192.168.40.1 on Vlan40, 00:10:12 ago
Routing Descriptor Blocks:
192.168.40.1, from 192.168.0.2, 00:10:12 ago, via Vlan40
Route metric is 3, traffic share count is 1
192.168.30.1, from 192.168.0.2, 00:10:12 ago, via Vlan30
Route metric is 3, traffic share count is 1
* 192.168.20.1, from 192.168.0.2, 00:10:12 ago, via Vlan20
Route metric is 3, traffic share count is 1
192.168.10.1, from 192.168.0.2, 00:10:12 ago, via Vlan10
Route metric is 3, traffic share count is 1
R1# show ip cef exact-route 100.100.100.1 200.200.200.1
100.100.100.1 -> 200.200.200.1 => IP adj out of Vlan40, addr 192.168.40.1
R1# show mls cef exact-route 100.100.100.1 0 200.200.200.1 0
Interface: Vl10, Next Hop: 192.168.20.1, Vlan: 10, Destination Mac: b414.8961.3780
R1# show etherchannel load-bal int port-ch 11 ip 100.100.100.1 200.200.200.1
Computed RBH: 0x7
Would select Te1/8 of Po11
HW: Next hop used for HW-based CEF (HW forwarding path). Note: "0" is used for both the src and dest L4 port numbers, as the test flow was ICMP echo.
Check which link between R1 and the DUT is chosen.
Equal-cost routes to the destination prefix.
SW: Next hop used for SW-based CEF (SW forwarding data path).
Note: R1 is a Cat6500 with Sup720, which supports the "mls" commands.
* denotes the path taken by the next process-switched packet; it moves in a round-robin fashion.
L3 Unicast Traffic Host 1 → Host 2: Which L3 Next Hop from the DUT?
Sup2T# show ip route 200.200.200.1
Routing entry for 200.200.200.1/32
Known via "ospf 100", distance 110, metric 2, type intra area
Last update from 172.16.20.2 on TenGigabitEthernet1/6, 00:36:01 ago
Routing Descriptor Blocks:
172.16.40.2, from 192.168.0.2, 00:36:01 ago, via TenGigabitEthernet2/6
Route metric is 2, traffic share count is 1
172.16.30.2, from 192.168.0.2, 00:36:01 ago, via TenGigabitEthernet2/5
Route metric is 2, traffic share count is 1
172.16.20.2, from 192.168.0.2, 00:36:01 ago, via TenGigabitEthernet1/6
Route metric is 2, traffic share count is 1
* 172.16.10.2, from 192.168.0.2, 00:36:01 ago, via TenGigabitEthernet1/5
Route metric is 2, traffic share count is 1
Sup2T# show ip cef exact-route 100.100.100.1 200.200.200.1
100.100.100.1 -> 200.200.200.1 => IP adj out of TenGigabitEthernet1/6, addr 172.16.20.2
Sup2T# show plat hardware cef exact-route 100.100.100.1 0 200.200.200.1 0
Interface: Te2/6, Next Hop: 172.16.40.2, ifnum: 0x12, Destination Mac: f866.f2d2.fa80
LIF: 0x20004013
Next hop used for SW based
CEF (SW forwarding data path)
Equal Cost Routes to
the destination prefix
Next hop used for HW based CEF (HW forwarding
path). Note: “0” is used for both src and dest L4
port numbers as test flow was ICMP echo
SW
HW
L3 Unicast Traffic Network Path Verification: Result
Topology (shown once per traffic direction): Host 1 (100.100.100.1) behind R1 and Host 2 (200.200.200.1) behind R2, connected through the Po11 trunk (R1 to DUT) and the four L3 links (DUT to R2).
Each direction can use different links in the bundles !
What Did We Get from Path Verification?
• The physical links the specific traffic flow should enter and leave the DUT on.
• Helps us isolate whether there is a faulty or oversubscribed interface.
• Caveats:
– Flapping links in a port channel can change the bundle hash mapping, and with it the physical path of the traffic.
– Clearing routes can also change the order in which the L3 adjacencies are re-programmed and, in the case of ECMP, change the physical path of the traffic.
– If any of these happen, you need to re-verify the path.
Troubleshooting Unicast Forwarding
• L2 Topology and Packet Flow
• L2 Packet Flow Troubleshooting L2 CAM, Interface counters/errors, Switch Fabric
• L3 Topology and Packet Flow
• L3 Packet Flow Troubleshooting FIB and Adjacency TCAM
Agenda
Detailed L3 Packet Flow Troubleshooting L3 FIB Table Programming Flow
[Diagram] The same two-module WS-X6908 path between Host1 and Host2 as in the L2 section; for L3 flows, additionally check the Layer 3/4 forwarding engine on the DFC4.
Detailed L3 Packet Flow Troubleshooting
L3/4 Engine in Detail: Counters and Tables
• L3 forwarding tables are programmed by software: the hardware holds a copy of the software forwarding tables.
• The EOBC is used for communication between the modules and the RP, and to program the L3 tables.
[Diagram] PFC4: L2 engine (128K L2 CAM, ACE counters) and L3/4 engine (FIB TCAM, ADJ TCAM, CL1/CL2 TCAMs, NetFlow, LIF table, LIF stats, LIF map, RPF table), connected via the DBUS, RBUS, and EOBC.
FIB / Adjacency Tables: L3 FIB Table Programming Flow in Sup2T

Verify the Layer 3 rewrite (FIB path):
• IOS Routing Table / RIB (RP): show ip route
• IOS FIB Table (RP): show ip cef
• IOS FIB Table (PFC/DFC): remote command module <mod> show ip cef
• FIB Table (PFC/DFC): show plat hard cef lookup <ip address> <mod>

Verify the Layer 2 rewrite (adjacency path):
• IOS ARP Cache Table (RP): show ip arp
• IOS Adjacency Table (RP): show ip cef adjacency
• IOS Adjacency Table (PFC/DFC): remote command module <mod> show adjacency detail
• Adjacency Table (PFC/DFC): show plat hard cef adjacency entry
Detailed L3 Packet Flow Troubleshooting
Sup2T# show ip route 100.100.100.1
Routing entry for 100.100.100.1/32
Known via "ospf 100", distance 110, metric 2, type intra area
Last update from 192.168.40.2 on Vlan40, 00:00:19 ago
Routing Descriptor Blocks:
192.168.40.2, from 192.168.252.10, 00:00:19 ago, via Vlan40
Route metric is 2, traffic share count is 1
192.168.30.2, from 192.168.252.10, 00:00:19 ago, via Vlan30
Route metric is 2, traffic share count is 1
192.168.20.2, from 192.168.252.10, 00:00:19 ago, via Vlan20
Route metric is 2, traffic share count is 1
* 192.168.10.2, from 192.168.252.10, 00:00:29 ago, via Vlan10
Route metric is 2, traffic share count is 1
Verify IP Routing Table
Host 2
Host 1
Sup2T# show ip route 200.200.200.1
Routing entry for 200.200.200.1/32
Known via "ospf 100", distance 110, metric 2, type intra area
Last update from 172.16.30.2 on TenGigabitEthernet2/5, 00:01:00 ago
Routing Descriptor Blocks:
* 172.16.40.2, from 192.168.0.2, 00:01:10 ago, via TenGigabitEthernet2/6
Route metric is 2, traffic share count is 1
172.16.30.2, from 192.168.0.2, 00:01:00 ago, via TenGigabitEthernet2/5
Route metric is 2, traffic share count is 1
172.16.20.2, from 192.168.0.2, 00:01:00 ago, via TenGigabitEthernet1/6
Route metric is 2, traffic share count is 1
172.16.10.2, from 192.168.0.2, 00:01:00 ago, via TenGigabitEthernet1/5
Route metric is 2, traffic share count is 1
SW
SW
Detailed L3 Packet Flow Troubleshooting L3 FIB Table and Counters
SW
Sup2T# show ip cef 200.200.200.1
200.200.200.1/32
nexthop 172.16.10.2 TenGigabitEthernet1/5
nexthop 172.16.20.2 TenGigabitEthernet1/6
nexthop 172.16.30.2 TenGigabitEthernet2/5
nexthop 172.16.40.2 TenGigabitEthernet2/6
Sup2T# show ip cef exact-route 100.100.100.1 src-port 0 200.200.200.1 dest-port 0
100.100.100.1 -> 200.200.200.1 => IP adj out of TenGigabitEthernet1/6, addr 172.16.20.2
Sup2T# show ip cef adjacency tengig 1/6 172.16.20.2
172.16.20.2/32
attached to TenGigabitEthernet1/6
200.200.200.1/32
nexthop 172.16.20.2 TenGigabitEthernet1/6
IP CEF entries for the destination IP address.
IP CEF adjacency entries for the next-hop IP address.
Sup2T# show platform hardware cef lookup 200.200.200.1
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
8080 200.200.200.1/32 Te1/5 ,f866.f2d2.fa80 (Hash: 0001)
Te1/6 ,f866.f2d2.fa80 (Hash: 0002)
Te2/5 ,f866.f2d2.fa80 (Hash: 0004)
Te2/6 ,f866.f2d2.fa80 (Hash: 0008)
Sup2T# show platform hardware cef exact-route 100.100.100.1 0 200.200.200.1 0
Interface: Te2/6, Next Hop: 172.16.40.2, ifnum: 0x12, Destination Mac:
f866.f2d2.fa80 LIF: 0x20004013
Detailed L3 Packet Flow Troubleshooting L3 FIB Table and Counters
HW
No more MLS for Sup2T engines. For Sup720, use "mls" instead of "platform hardware".
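One way to read the "Hash:" column in the lookup output above (an interpretation for illustration, not documented behavior): each ECMP adjacency owns the hash buckets whose bits are set in its mask, and the flow's hash bucket selects the owning path:

```python
# Sketch: interpreting the per-adjacency "Hash:" bitmaps as one-hot
# hash-bucket ownership masks for the four ECMP paths.
paths = {"Te1/5": 0x0001, "Te1/6": 0x0002, "Te2/5": 0x0004, "Te2/6": 0x0008}

def pick_path(bucket: int) -> str:
    """Return the interface whose mask covers the given hash bucket."""
    mask = 1 << bucket
    for intf, owned in paths.items():
        if owned & mask:
            return intf
    raise ValueError("no path owns bucket %d" % bucket)

print(pick_path(3))
```

Under this reading, "show platform hardware cef exact-route" is the switch computing the flow's bucket for you and reporting the winning adjacency.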
Detailed L3 Packet Flow Troubleshooting L3 FIB Table and Counters
Sup2T# show adjacency TenGigabitEthernet1/6 172.16.20.2 detail
Protocol Interface Address
IP TenGigabitEthernet1/6 172.16.20.2(14)
0 packets, 0 bytes
epoch 0
sourced in sev-epoch 0
Encap length 14
F866F2D2FA80B414896137800800
L2 destination address byte offset 0
L2 destination address byte length 6
Link-type after encap: ip
ARP
SW
HW Sup2T# show platform hardware cef ip 200.200.200.1 detail module 1
Codes: M - mask entry, V - value entry, A - adjacency index, NR- no_route bit
LS - load sharing count, RI - router_ip bit, DF: default bit
CP - copy_to_cpu bit, AS: dest_AS_number, DGTv - dgt_valid bit
DGT: dgt/others value
Format:IPV4 (valid class vpn prefix)
M(8080 ): 1 F 3FFF 255.255.255.255
V(8080 ): 1 0 0 200.200.200.1
(A:311296, LS:3, NR:0, RI:0, DF:0 CP:0 DGTv:1, DGT
Rewrite information (Dmac|Smac|0800): verify it conforms with the next-hop rewrite info.
Ingress module, for the specific flow.
The start adjacency pointer is 311296.
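The "A:311296, LS:3" fields above suggest the following selection scheme (our simplified reading: consecutive adjacency entries starting at the base pointer, with the flow hash choosing an offset 0..LS):

```python
def adjacency_index(base: int, ls_count: int, flow_hash: int) -> int:
    """Pick one of the ECMP rewrites: LS:3 means offsets 0..3, i.e.
    four consecutive adjacency entries starting at `base`."""
    return base + (flow_hash % (ls_count + 1))

# The entry examined on the next slide is the base entry itself.
print(adjacency_index(311296, 3, 0))
```

This is why the next slide dumps adjacency entry 311296: it is the first of the four rewrites behind this FIB entry.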
Detailed L3 Packet Flow Troubleshooting L3 FIB Table and Counters
Sup2T# show platform hardware cef adjacency entry 311296 detail module 1
Index: 311296 -- Valid entry (valid = 1) –
Adjacency fields:
___________________________________________________
|adj_stats = EN | fwd_stats = EN | trig = 0
|_________________|__________________|______________
|l3_enable = ON (classify as Layer3) | age = 3
|_________________|__________________|______________
|format = IP | rdt = ON | ignr_emut = 0
|_________________|__________________|______________
|vpn = 0x3FFF | elif = 0x400C | ri = 3
|_________________|__________________|______________
|top_sel = 0 | zone_enf = OFF | fltr_en = OFF
|_________________|__________________|______________
|frr_te = OFF | idx_sel = 0 | tnl_encap = 0
|_________________|__________________|______________
|rw_hint = 0 | ttl_control = 4 |
|_________________|__________________|______________
Format of the
packet sent out on the wire ...
HW
Checking the entry in the ingress module
Detailed L3 Packet Flow Troubleshooting L3 FIB Table and Counters
Rewrite MAC info
RIT fields: The entry has a Layer2 Format
_________________________________________________________
|decr_ttl = YES | pipe_ttl = 0 | utos = 0
|_________________|__________________|____________________
|l2_fwd = 0 | rmac = 0 | ccc = L3_REWRITE
|_________________|__________________|____________________
|rm_null_lbl = YES| rm_last_lbl = YES| pv = 0
|_________________|__________________|____________________
|add_shim_hdr= NO | rec_findex = N/A | rec_shim_op = N/A
|_________________|__________________|____________________
|rec_dti_type = N/A | rec_data = N/A
|____________________________________|____________________
|modify_smac = YES| modify_dmac = YES| egress_mcast = NO
|____________________________________|____________________
|ip_to_mac = NO
|_________________________________________________________
|dest_mac = f866.f2d2.fa80 | src_mac = b414.8961.3780
|___________________________|_____________________________
|
Statistics: Packets = 0
Bytes = 0
Output Continued ….
Counters increment on the ingress DFC/PFC, and are cleared when the adjacency is read.
HW
Detailed L3 Packet Flow Troubleshooting
• Default entry: 0.0.0.0/0 (“match all”)
• Always at the bottom of the FIB TCAM; if there is no default route, packets hit the drop adjacency.
• Drop adjacency (route to Null0): subject to rate limiter "ICMP UNREAC. NO-ROUTE"
L3 FIB Table Special Entries/Adjacencies
47
Sup2T# show ip route 0.0.0.0 0.0.0.0
% Network not in table
Sup2T# show plat hard cef lookup 123.0.1.1
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
134368 0.0.0.0/0 drop
Sup2T# show plat hard cef lookup 123.0.1.1
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
134368 0.0.0.0/0 Vl1200 ,0011.bc75.9c00
No default route present
Match-all entry links to the drop adjacency, which is subject to rate limiter "ICMP UNREAC. NO-ROUTE".
In-profile packets get punted to the CPU, so this is a possible reason for packets hitting the CPU.
After adding default route to Vlan 1200, adjacency points to next hop, all switched in HW
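This drop-vs-forward distinction lends itself to a quick scripted check. Below is a minimal sketch (not a Cisco tool; it assumes the raw text of a `show platform hardware cef lookup` has already been captured, e.g. over SSH) that flags a destination resolving to the drop adjacency:

```python
def resolves_to_drop(lookup_output: str) -> bool:
    """Return True if the CEF lookup landed on the 0.0.0.0/0 drop adjacency,
    meaning in-profile packets will be punted under 'ICMP UNREAC. NO-ROUTE'."""
    for line in lookup_output.splitlines():
        fields = line.split()
        # Data rows look like: "<index> <prefix> <adjacency> ..."
        if len(fields) >= 3 and fields[1] == "0.0.0.0/0":
            return fields[2] == "drop"
    return False

no_default = "Index Prefix Adjacency\n134368 0.0.0.0/0 drop"
with_default = "Index Prefix Adjacency\n134368 0.0.0.0/0 Vl1200 ,0011.bc75.9c00"
print(resolves_to_drop(no_default), resolves_to_drop(with_default))  # True False
```

A True result is a hint to check the "ICMP UNREAC. NO-ROUTE" rate limiter and CPU-bound traffic before anything else.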
Detailed L3 Packet Flow Troubleshooting
• FIB receive (local IP address): subject to rate limiter “CEF RECEIVE”
• Traffic matching CEF GLEAN entry is subject to rate-limiter “CEF GLEAN”
L3 FIB Table Special Entries/Adjacencies
48
Sup2T# show plat hard cef lookup 172.16.10.1
Index Prefix Adjacency
7701 172.16.10.1/32 receive
If not present, packets for local IP addresses don’t get to RP (SW)
Sup2T# show ip route 172.16.40.0
Routing entry for 172.16.40.0/24
Known via "connected", distance 0, metric 0
(connected, via interface)
Routing Descriptor Blocks:
* directly connected, via TenGigabitEthernet2/6
Route metric is 0, traffic share count is 1
Sup2T# show ip arp 172.16.40.2
Sup2T#
Sup2T# show plat hard cef look 172.16.40.2
Codes: decap - Decapsulation, + - Push Label
Index Prefix Adjacency
2128 172.16.40.0/24 glean
Known Subnet
If not present, packets to unresolved IP addresses for directly connected hosts/routers will not get punted to CPU (SW) to trigger ARP resolution
Unresolved ARP for directly connected host
Troubleshooting Unicast Forwarding Summary
49
Determine path-of-the-packet through a
L2 and L3 network
L2 Forwarding
‒ Check MAC Learning
‒ L2 MAC tables are in sync (flooding)
‒ Interface Errors and Statistics
‒ Switch fabric path
L3 Forwarding
‒ SW and HW FIB entries
‒ Adjacency / Rewrite info
It is critical to determine the flow experiencing packet loss and to find the path-of-the-packet through the network.
Knowledge of switch hardware and
software architecture expedites the
troubleshooting, and helps for timely
resolution of the problem.
L2 and L3 forwarding troubleshooting for Catalyst 6800 is the same as for Sup2T-based Catalyst 6500.
Take Away Points
We can troubleshoot it !!
Sup2T Unicast Forwarding
• IP Unicast Layer 3 Switching: http://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst6500/ios/151SY/config_guide/sup2T/15_1_sy_swcg_2T/cef.html
• Catalyst 6500 Switches ARP and CAM Table Issues Troubleshooting: http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/71079-arp-cam-tableissues.html
• Catalyst 6500 Series Switches – Troubleshooting TechNotes: http://www.cisco.com/c/en/us/support/switches/catalyst-6500-series-switches/products-tech-notes-list.html
• Cisco Catalyst Instant Access – Q&A: http://www.cisco.com/c/en/us/products/collateral/switches/catalyst-6800ia-switch/qa_c67-728684.html
Reference Materials
50
Troubleshooting Multicast Forwarding
Multicast Troubleshooting
• Terminology
• Multicast Replication and Modes
• Multicast Forwarding Troubleshooting
Agenda
52
Terminology
• OIF: Outgoing Interface
• OIL: Outgoing Interface List
• IGMP: Internet Group Management Protocol
• Multicast FIB: Contains the (*,G) and (S,G) entries as well as RPF-VLAN
• Adjacency Table: Contains the rewrite information and MET index
• LTL: Local Target Logic - Forwarding logic for the Catalyst® 6500 / 6800
• MET: Multicast Expansion Table - Hardware table that contains the OIFs for the (*,G) and (S,G) entries
54
Local Target Logic (LTL)
• Every valid packet that ingresses the Catalyst 6500/6800 will be sent to a forwarding engine (FE) within the system (DFC or the PFC on the supervisor)
• The FE makes the decision about where to forward the packet or to drop the packet
• Part of the result of the forwarding decision is a destination LTL index (or destination index)
• The destination index is used to select the physical port(s) that will forward the packet
• For multicast, another important part of the forwarding decision is the MET index
55
Multicast Expansion Table (MET)
56
• The MET is memory where the list of OIFs for the multicast entries is stored
• A MET block contains the list of OIFs and the corresponding destination LTL index for each
• Each replication engine has a separate MET
• MET index from the CEF adjacency can be used to read the table
• MET tables are independent of the DFC. In other words, even CFC modules have MET tables
maps to port or set of ports
Multicast Replication
• Replication: Process of creating copies of packets
• L2 Replication: Creating copies of a packet within a single VLAN (e.g., Forwarding a single broadcast packet out all ports within a VLAN) – Does not require a replication engine
• L3 Replication: Creating copies of a multicast packet for forwarding out each of the interfaces in an OIL – Requires a replication engine
• For this multicast discussion, the term Replication will mean L3 Replication
58
Ingress Replication Mode
• Replication engine on ingress module performs replication for all OIFs
• One copy of the original packet is forwarded across the fabric for each of the OIFs
• Input and replicated packets get a lookup on the PFC or ingress DFC
• The system defaults to ingress mode when at least one module not capable of egress mode is present in the system
• METs on all replication engines are symmetric / synchronized
59
[Diagram: four modules, each with a replication engine (RE), attached to the switch fabric; three copies of the packet cross the fabric, one per OIF]
Egress Replication Mode
• Input packets get a lookup on the ingress DFC; replicated packets get a lookup on the egress DFC
• For OIFs on ingress module, local engine performs the replication
• For OIFs on other modules, ingress engine replicates a single copy of packet over fabric to all egress modules
• Engine on egress module performs replication for local OIFs
• MET tables on different modules can be asymmetric
60
[Diagram: four modules, each with a replication engine (RE), attached to the switch fabric; a single copy of the packet crosses the fabric]
Diagram for Troubleshooting Example
62
[Diagram: source 172.16.10.1 sends group 225.1.1.1 into Gi1/1 (VLAN 10) on the DUT; receiver 10.10.20.3 on Gi1/2 (VLAN 20), receiver 10.10.30.3 on Gi4/1 (VLAN 30), and receiver 10.10.40.3 reached across an L3 link on Gi4/2 via a router in a Layer 3 network]
• DUT is a Catalyst 6500 with a Sup2T engine
• Modules 1 and 4 are WS-X6824-SFP with DFC4-A.
• Server sending 225.1.1.1 stream, received on Gig1/1 in Vlan 10
• Receivers are connected to modules 1 and 4, in VLANs 20 and 30, and across an L3 link
Multicast Replication Modes • In a classic system (all modules are non-DFC), replication always occurs on the active supervisor engine
• In a fully fabric-enabled system, there are two possible replication modes: Ingress replication mode
Egress replication mode
Sup2T#show platform hardware capacity multicast
L3 Multicast Resources
Replication mode: egress
Bi-directional PIM Designated Forwarder Table Capacity: 8 Per Vrf
Bi-directional PIM Designated Forwarder Table usage:
Vrf IPV4 used IPV6 used Total used
Replication capability: Module Capability
1 egress
4 egress
6 egress
MET table Entries: Module Total Used %Used
1 65518 4 1%
4 65518 6 1%
6 32752 2 1%
Multicast LTL Resources
Usage: 38848 Total, 581 Used
63
Capabilities of each module in the system. A single card in the chassis capable only of ingress mode causes the system mode to move to ingress.
Use show mls ip multicast capability in older versions.
Shows that the mode for the system is Egress.
“Router” port indicates that
the CPU is an mrouter port
IGMP Snooping
Use show ip igmp groups [group] to verify that the receivers’ membership reports are
received by the switch
Sup2T#sh ip igmp groups 225.1.1.1
IGMP Connected Group Membership
Group Address Interface Uptime Expires Last Reporter Group Accounted
225.1.1.1 Vlan30 01:44:42 00:02:24 10.10.30.5
225.1.1.1 Vlan20 01:44:42 00:02:17 10.10.20.5
225.1.1.1 GigabitEthernet4/2 01:48:46 00:02:13 10.10.40.3
Membership Reports and L2 Forwarding Table
64
Use show mac-address-table multicast igmp-snooping to display the
IGMP Snooping L2 forwarding table
Sup2T#sh mac address-table multicast igmp-snooping
vlan mac/ip address LTL ports
+----+-----------------------------------------+------+--------------
20 ( *,225.1.1.1) 0x912 Router Gi1/2
30 ( *,225.1.1.1) 0x914 Router Gi4/1
10 IPv4 OMF 0x90C Router
20 IPv4 OMF 0x90C Router
30 IPv4 OMF 0x90C Router
Gig1/2 and Gig4/1 are receivers in VLANs 20 and 30 respectively.
Shows ONLY the last reporter.
Shows the receivers in the VLANs and L3 interfaces.
If a specific VLAN is not listed, then there is an issue with IGMP.
RPF neighbor
RPF VLAN
Multicast Forwarding
Sup2T#show ip mroute 225.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
<snip>
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 225.1.1.1), 02:02:40/stopped, RP 192.168.100.1, flags: SJC
Incoming interface: Null, RPF nbr 10.10.10.5
Outgoing interface list:
Vlan30, Forward/Sparse, 01:50:46/00:02:14
Vlan20, Forward/Sparse, 01:50:46/00:02:15
GigabitEthernet4/2, Forward/Sparse, 01:54:50/00:02:11
(172.16.10.1, 225.1.1.1), 01:32:44/00:02:09, flags: JT
Incoming interface: Vlan10, RPF nbr 10.10.10.5
Outgoing interface list:
GigabitEthernet4/2, Forward/Sparse, 01:32:44/00:02:11
Vlan20, Forward/Sparse, 01:32:44/00:02:15
Vlan30, Forward/Sparse, 01:32:44/00:02:14
(S,G) Entry in SW
65
(S,G)
OIL
Make sure that drops are not incrementing. If RPF drops are seen, do show ip rpf <src-ip-addr> to verify the RPF information. Also, do show ip route <src-ip-addr> to verify the RPF interface for that multicast stream.
Multicast Forwarding
Sup2T#show ip mroute 225.1.1.1 count
<snip>
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)
Group: 225.1.1.1, Source count: 1, Packets forwarded: 720, Packets received: 720
RP-tree: Forwarding: 3/0/100/0, Other: 3/0/0
Source: 172.16.10.1/32, Forwarding: 717/0/100/0, Other: 717/0/0
Sup2T#show ip mfib 225.1.1.1 count
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)
<snip>
Group: 225.1.1.1
RP-tree,
SW Forwarding: 0/0/0/0, Other: 0/0/0
HW Forwarding: 3/0/100/0, Other: 0/0/0
Source: 172.16.10.1,
SW Forwarding: 0/0/0/0, Other: 0/0/0
HW Forwarding: 878/0/100/0, Other: 0/0/0
Totals - Source count: 1, Packet count: 881
Forwarded Multicast Packets
66
Make sure that forwarding packet counts are incrementing (updated every 10 seconds).
This command is recommended for faster response, in large-scale deployments.
Packets forwarded in hardware vs. software
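Since the hardware counters refresh roughly every 10 seconds, a scripted diff of two snapshots makes stalled flows easy to spot. A hedged sketch (assuming the text of two successive `show ip mfib <group> count` outputs has been saved):

```python
import re

def hw_forwarded(output: str) -> dict:
    """Map each source in 'show ip mfib <group> count' output to its
    HW Forwarding packet count."""
    counts, source = {}, None
    for line in output.splitlines():
        m = re.match(r"\s*Source: ([\d.]+)", line)
        if m:
            source = m.group(1)
            continue
        m = re.match(r"\s*HW Forwarding:\s*(\d+)/", line)
        if m and source:
            counts[source] = int(m.group(1))
    return counts

before = hw_forwarded("Source: 172.16.10.1,\n HW Forwarding: 878/0/100/0, Other: 0/0/0")
after = hw_forwarded("Source: 172.16.10.1,\n HW Forwarding: 1450/0/100/0, Other: 0/0/0")
for src, pkts in before.items():
    delta = after.get(src, 0) - pkts
    print(src, "incrementing" if delta > 0 else "STALLED", delta)
```

A zero delta between snapshots taken more than 10 seconds apart points to a programming or replication problem rather than a counter-refresh artifact.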
Multicast Forwarding Entry Egress Mode
• The primary entry is used by the ingress forwarding engine for:
– Forwarding to all receivers and mrouters in the ingress VLAN
– Forwarding to all “local” receivers and mrouters on all OIFs in the OIL
– Forwarding a copy of the packet across the switching fabric to egress module(s)
• The secondary (or non-primary) entry is used by the egress forwarding engines for:
– Forwarding to all “local” receivers and mrouters on all OIFs in the OIL
Primary and Secondary Entries
67
Multicast Forwarding Entry Egress Mode
Sup2T-dfc1#sh platform hardware multicast routing ip group 225.1.1.1 detail
IPv4 Multicast CEF Entries for VPN#0
<snip>
(172.16.10.1, 225.1.1.1/32)
FIBAddr: 0x40 IOSVPN: 0 RpfType: SglRpfChk SrcRpf: Vl10
CPx: 0 s_star_pri: 1 non-rpf drop: 0
PIAdjPtr: 0x38001 Format: IP rdt: off elif: 0xC5409
fltr_en: off idx_sel/bndl_en: 0 dec_ttl: on mtu_idx: 2(1518)
PV: 1 rwtype: MCAST_L3_RWT_L2_EXPS
met3: 0x34 met2: 0x28
Packets: 1393 Bytes: 139300
NPIAdjPtr: 0x38002 Format: IP rdt: on elif: 0xC5409
fltr_en: off idx_sel/bndl_en: 0 dec_ttl: off
PV: 0 rwtype: MCAST_L3_REWRITE
met3: 0x34 met2: 0x0 DestNdx: 0x7FF3
Packets: 0 Bytes:
Closer Look at the Primary Entry
68
(S,G) RPF VLAN
Primary Entry (PI)
Access the DFC using the “remote login module X” command.
met3: MET index used to retrieve the LTL indices for receivers and mrouters local to the ingress replication engine.
met2: MET index used to retrieve the LTL index used to forward a single copy of the multicast packet across the switching fabric.
Non-primary Entry (NPI)
Number of packets/bytes forwarded using this entry.
Multicast Forwarding Entry Egress Mode
69
Continued ……
NPIAdjPtr: 0x38002 Format: IP rdt: on elif: 0xC5409
fltr_en: off idx_sel/bndl_en: 0 dec_ttl: off
PV: 0 rwtype: MCAST_L3_REWRITE
met3: 0x34 met2: 0x0 DestNdx: 0x7FF3
Packets: 0 Bytes:
MET offset: 0x34
OIF AdjPtr Elif CR
+------+----------+--------+---+
Vl20 0x8014 0x14 1T1
MET offset: 0x28
OIF AdjPtr Elif CR
+---------+-------+---------+----+
EDT-34001 0x34001 0x8400A 1T1
Closer Look at the Primary Entry (continued)
met3 Index (from primary entry)
met2 Index (from primary entry)
Vlan 20 (receiver connected to Gig1/2)
For the copy of the packet sent via the switching fabric
Multicast Forwarding Entry Egress Mode
Sup2T-dfc4#sh platform hardware multicast routing ip group 225.1.1.1 detail
IPv4 Multicast CEF Entries for VPN#0
<snip>
(10.10.10.5, 225.1.1.1/32)
FIBAddr: 0x0A IOSVPN: 0 RpfType: SglRpfChk SrcRpf: Vl10
CPx: 0 s_star_pri: 1 non-rpf drop: 0
PIAdjPtr: 0x54000 Format: IP rdt: off elif: 0xC5409
fltr_en: off idx_sel/bndl_en: 0 dec_ttl: on mtu_idx: 2(1518)
PV: 1 rwtype: MCAST_L3_RWT_L2_EXPS
met3: 0xA met2: 0x8
Packets: 0 Bytes: 0
NPIAdjPtr: 0x54001 Format: IP rdt: on elif: 0xC5409
fltr_en: off idx_sel/bndl_en: 0 dec_ttl: off
PV: 0 rwtype: MCAST_L3_REWRITE
met3: 0xA met2: 0x0 DestNdx: 0x7FF3
Packets: 1393 Bytes: 139300
Closer Look at the Secondary Entry
70
(S,G) RPF VLAN
Primary Entry (PI)
Access the DFC using the “remote login module X” command.
met3: MET index used to retrieve the LTL indices for receivers and mrouters local to the egress replication engine.
met2: MET index used to retrieve the LTL index used to forward a single copy of the multicast packet across the switching fabric. Here, met2 = 0, because the egress module will NOT send anything back to the fabric for this specific (S,G) flow.
Non-primary Entry (NPI)
Number of packets/bytes forwarded using this entry.
Multicast Forwarding Entry Egress Mode
71
Continued ……
NPIAdjPtr: 0x54001 Format: IP rdt: on elif: 0xC5409
fltr_en: off idx_sel/bndl_en: 0 dec_ttl: off
PV: 0 rwtype: MCAST_L3_REWRITE
met3: 0xA met2: 0x0 DestNdx: 0x7FF3
Packets: 1393 Bytes: 139300
MET offset: 0xA
OIF AdjPtr Elif CR
+-------------+----------+-----------+------------+
Gig4/2 0x800C 0x408F 4/T1
Vl30 0x801E 0x1E 4/T1
MET offset: 0x8
OIF AdjPtr Elif CR
+-------------+----------+-----------+------------+
EDT-34005 0x5C000 0x840A 4/T
Found 2 entries.
Closer Look at the Secondary Entry (continued)
met3 Index
Receiver connected to Gig4/2 across an L3 interface, and a receiver on Gig4/1 in vlan 30
Hexadecimal 0x14 = Vlan 20 in decimal.
Multicast Forwarding Entry Egress Mode
From commands executed in module 1 … (172.16.10.1, 225.1.1.1/32)
PIAdjPtr: 0x38001 Format: IP rdt: off elif: 0xC5409
met3: 0x34 met2: 0x28
Read MET table directly
Sup2T-dfc1#show platform hard met read slot 1 addr 34
Starting Offset: 0x0034
V E C:3969 I:0x00014 (A: 0x008014)
Sup2T-dfc1#sh platform hard met read slot 1 addr 28
Starting Offset: 0x0028
V E C:3974 I:0x04001 (A: 0x034001)
Read MET table directly
72
From commands executed in module 4 … (172.16.10.1, 225.1.1.1/32)
NPIAdjPtr: 0x54001 Format: IP rdt: off elif: 0xC5409
met3: 0x0A met2: 0x00 DestNdx: 0x7FF3
Read MET table directly
Sup2T-dfc4#show platform hard met read slot 4 addr 0A
Starting Offset: 0x000A
V C:3989 I:0x00000 (A: 0x0A8000)P->C
V E C:3969 I:0x0001E (A: 0x00801E)
Hexadecimal 0x1E = Vlan 30 in decimal.
Entry used to send traffic out on L3 port Gig4/2
Entry used to send traffic across the fabric
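As the callouts note, for a VLAN OIF the index in the MET entry is simply the VLAN ID in hexadecimal. A trivial helper (a convenience sketch, not an official Cisco tool) for converting the `I:` values printed by `show platform hardware met read`:

```python
def met_index_to_vlan(index_hex: str) -> int:
    """Convert a MET entry index such as '0x14' to its decimal VLAN ID."""
    return int(index_hex, 16)

print(met_index_to_vlan("0x14"), met_index_to_vlan("0x1E"))  # 20 30
```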
Multicast Egress Replication In Summary
73
[Diagram: Module 1 (Gi1/1 in VLAN 10 ingress, Gi1/2 in VLAN 20) uses the primary entry and its local MET; a single copy crosses the switch fabric to Module 4 (Gi4/1 in VLAN 30, Gi4/2 L3 interface), which uses the secondary entry and its own MET. RE = Replication Engine, MET = MET Table]
Multicast Forwarding – Resources
Sup2T#show platform hardware capacity multicast
L3 Multicast Resources
Replication mode: egress
Bi-directional PIM Designated Forwarder Table Capacity: 8 Per Vrf
Bi-directional PIM Designated Forwarder Table usage:
Vrf IPV4 used IPV6 used Total used
Replication capability: Mod Capability
1 egress
2 egress
5 egress
MET table Entries: Mod Total Used %Used
1 65518 2 1%
2 65518 6 1%
5 32744 2 1%
Multicast LTL Resources
Usage: 23488 Total, 12932 Used
Monitor resource usage
74
Multicast Forwarding
Q1: Where should I do the troubleshooting commands discussed – DFC or PFC ?
A1: If the module (ingress/egress) has DFC, then the troubleshooting commands should be done there.
Q2: What happens when the modules are mixed (some have a DFC and some do not), in egress replication mode ?
A2: If the ingress module does not have a DFC, replication occurs on that module while the forwarding lookup is done at the active supervisor engine (PFC). Once the traffic reaches the egress module (which has a DFC), traffic replication and forwarding lookup are performed by that module itself.
Q3: Will there be any performance hit if modules do not have a DFC ?
A3: Yes. The module(s) replicating the traffic must perform the lookup at the PFC, which takes longer than a lookup at a local DFC. This may result in multicast traffic drops at the ingress or egress module, specifically during traffic oversubscription or bursts.
Frequently Asked Questions
75
Sup2T Multicast Forwarding
• IPv4 Multicast Layer 3 Switching: http://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst6500/ios/12-2SX/configuration/guide/book/mcastv4.html
• Catalyst 6500 Series Switches – Configuration and Troubleshooting Multicast: http://www.cisco.com/c/en/us/support/switches/catalyst-6500-series-switches/products-tech-notes-list.html#anchor16
• Cisco Catalyst Instant Access – Q&A: http://www.cisco.com/c/en/us/products/collateral/switches/catalyst-6800ia-switch/qa_c67-728684.html
Reference Materials
76
Troubleshooting Multicast Forwarding Summary
77
Multicast terminology
Multicast replication and modes
Multicast forwarding troubleshooting
‒ L2 programming
‒ L3 programming and statistics
Multicast hardware resource usage
For best multicast performance, it is recommended to have all modules DFC-enabled and the replication mode set to egress.
Keep an eye on the resource usage.
Multicast forwarding troubleshooting for Catalyst 6800 is the same as for Sup2T-based Catalyst 6500s.
Take Away Points
We can troubleshoot it !!
Troubleshooting High CPU Utilization
Troubleshooting High CPU Utilization
• High CPU Utilization – Due to interrupts – troubleshooting commands
– Monitoring hardware resource usage
• Control Plane Protection and Monitoring – Hardware Rate-Limiters (HWRL) and Control Plane Policing (CoPP)
– Flexible Netflow (FnF)
Agenda
79
Troubleshooting High CPU Utilization
81
At what utilization level should I start troubleshooting ?
It depends on the nature and level of the traffic. It is essential to establish a baseline CPU usage under normal working conditions, and start troubleshooting when usage goes above a specific threshold.
E.g., baseline RP CPU usage is 25%; start troubleshooting when the RP CPU usage is consistently at 40% or above.
Why should I be concerned about high CPU usage ?
It is very important to protect the control plane for network stability, as resources (CPU, memory and buffers) are shared by control-plane traffic and data-plane traffic sent to the CPU for further processing.
What are the usual symptoms of high CPU usage ?
• Control-plane instability e.g., OSPF flap, EIGRP flap
• Reduced switching / forwarding performance
• Slow response to Telnet / SSH
• SNMP poll miss
Frequently Asked Questions
High CPU Utilization
82
Investigate CPU utilization via “show proc cpu” and determine whether the usage is due to processes and/or interrupts
Sup2T# show process cpu
CPU utilization for five seconds: 99%/90%; one minute: 9%; five minutes: 8%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
2 720 88 8181 9.12% 1.11% 0.23% 18 Virtual Exec
If CPU utilization is due to:
Processes: caused by recurring events, control-plane traffic, etc.
Interrupts: caused by an inappropriate switching path, the system running out of hardware resources, etc.
Total CPU usage (Process + Interrupt); CPU usage due to Interrupt
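The total vs. interrupt split can be pulled out of the first line of “show process cpu” programmatically. A minimal sketch (assuming the output text has already been collected from the device):

```python
import re

def parse_cpu_line(line: str):
    """Return (total_5s, interrupt_5s) from the first line of 'show process cpu'.
    In '99%/90%', 99% is total usage and 90% is usage at interrupt level."""
    m = re.search(r"five seconds: (\d+)%/(\d+)%", line)
    if not m:
        raise ValueError("unexpected 'show process cpu' format")
    return int(m.group(1)), int(m.group(2))

total, interrupt = parse_cpu_line(
    "CPU utilization for five seconds: 99%/90%; one minute: 9%; five minutes: 8%")
print(total, interrupt, total - interrupt)  # 99 90 9 -> only 9% is process-driven
```

A large interrupt percentage steers the investigation toward punted traffic and hardware-resource exhaustion rather than toward a runaway process.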
High CPU Utilization Due to processes …
High CPU due to ARP Input Process:
Caused by ARP flooding, or by a static route configured with an interface instead of a next-hop IP address. The latter generates an ARP request for every packet not reachable via a more specific route.
ip route 0.0.0.0 0.0.0.0 GigabitEthernet 2/5
High CPU due to IP Input Process:
Caused by traffic that needs to be process-switched or destined
to the CPU.
Most Common Reasons:
Broadcast storm
Traffic with IP-Options enabled
Traffic that requires an ICMP Redirect or Unreachable
e.g., TTL=1, ACL Deny etc.
Traffic that needs further CPU processing e.g., ACL
Logging
High CPU due to BGP Scanner Process:
Walks the BGP table and confirms reachability of the next hops. It also checks conditional advertisement to determine whether or not BGP should advertise conditional prefixes, and performs route dampening. It is normal to see this process spike for a short duration when the device carries a huge Internet routing table.
Excessive BGP control traffic received by the CPU.
High CPU due to SNMP Engine Process:
Due to aggressive polling of MIBs. “show snmp” provides
SNMP input and output stats.
High CPU due to Exec / Virtual Exec Process:
Caused by sending too many messages to console / VTY session(s).
Usually caused by packet debugging and sending logs to console / VTY session(s). Check “show debug” results and do “undebug all” if necessary.
Are you running a “show tech” command and sending results to a console / VTY session ?
83
High CPU Utilization
Control Plane (L2/L3) Protocols | Control Plane Packet Forwarding
UDLD Protocol | IP Options
PAgP Protocol | Fragmentation
LACP Protocol | Select Tunnel Options
SNMP Protocol | ICMP Packets
Syslog Export | MTU failure
Netflow & Netflow Data Export | TTL=1 or TTL=0
Address Resolution Protocol (ARP) | Packets with Checksum error or error length
HSRP, VRRP, GLBP | RPF Check
Cisco Discovery Protocol (CDP) | Packets that require ARP resolution
VLAN Trunking Protocol | Non-IP (IPX, AppleTalk)
Dynamic Trunking Protocol | ACL logging
Telnet, IPsec, SSH | Broadcast traffic denied in RACL
BGP, OSPF, EIGRP, RIP, ISIS | Authentication Proxy
Web Cache Control Protocol | PBR traffic for certain “match” or “set” arguments
Protocols and Services Processed in Software
84
High CPU Utilization
Sup2T# show ip traffic
IP statistics:
Rcvd: 81676 total, 20945 local destination
0 format errors, 0 checksum errors, 41031 bad hop count
0 unknown protocol, 19609 not a gateway
0 security failures, 0 bad options, 120 with options
Frags: 0 reassembled, 0 timeouts, 0 couldn't reassemble
0 fragmented, 0 couldn't fragment
Bcast: 417 received, 0 sent
Mcast: 11423 received, 52655 sent
Sent: 61340 generated, 0 forwarded
Drop: 32 encapsulation failed, 0 unresolved, 0 no adjacency
ICMP statistics:
Rcvd: 0 format errors, 0 checksum errors, 17 redirects, 112 unreachable
812 echo, 812 echo reply, 0 mask requests, 0 mask replies, 0 quench
0 parameter, 0 timestamp, 0 info request, 0 other
ARP statistics:
Rcvd: 3518120 requests, 3636408 replies, 0 reverse, 0 other
<snip>
Traffic to CPU statistics
85
It also displays stats for : BGP, EIGRP,
TCP, UDP, PIM, IGMP and OSPF
Do “clear ip traffic” to reset the counters, and monitor.
TTL < 2 traffic
ARP sent and received
ICMP sent / received for various reasons
Broadcast traffic
Traffic with IP Options
Unresolved ARP
High CPU Utilization
Most of the time, packets punted to the CPU have common factor(s).
• Packets received on the same vlan / interface or interfaces in the same module or same VRF etc.
• Packets have a specific destination, or are destined to prefixes learned from a specific neighbor
• Anything else common ?
Or …..
• Has the system experienced any exception condition ?
• Is the system running out of hardware resources ?
Troubleshooting high CPU due to interrupts
86
High CPU Utilization
Sup2T# show ip cef switching statistics
Path Reason Drop Punt Punt2Host
<snip>
RP PAS Packet destined for us 0 25220386 0
RP PAS No adjacency 47 0 0
RP PAS Incomplete adjacency 427 0 0
RP PAS TTL expired 0 0 2
RP PAS IP options set 0 0 123735
RP PAS Routed to Null0 553005807 0 3661615
RP PAS Features 16395353 0 109755827
RP PAS Unclassified reason 51021 0 0
RP PAS Total 569452655 25220386 311875742
All Total 569452943 50440774 311875742
Due to Interrupts - switching path statistics
87
Packets dropped due to no hardware adjacency found
Packets punted by the features enabled.
Sup2T# show ip cef switching statistics feature
IPv4 CEF input features:
Path Feature Drop Consume Punt Punt2Host Gave route
RP PAS uRPF 16395353 0 0 0 0
RP PAS WCCP 0 0 0 109755827 0
Total 16395353 0 0 109755827 0
<snip>
Unicast RPF and WCCP features punting the traffic
Supported in 12.2(33)SXH onwards. This command has replaced “show cef not-cef-switched”.
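To find the dominant punt or drop reason quickly, the statistics table can be ranked by total count. A hedged sketch (it assumes the saved text of `show ip cef switching statistics`, with the column layout shown on this slide):

```python
def top_reasons(output: str, n: int = 3):
    """Rank 'show ip cef switching statistics' rows by Drop+Punt+Punt2Host."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows end in three numeric columns: Drop, Punt, Punt2Host
        if len(parts) >= 6 and all(p.isdigit() for p in parts[-3:]):
            reason = " ".join(parts[2:-3])
            if reason and reason != "Total":
                rows.append((reason, sum(int(p) for p in parts[-3:])))
    return sorted(rows, key=lambda r: -r[1])[:n]

sample = ("RP PAS Packet destined for us 0 25220386 0\n"
          "RP PAS No adjacency 47 0 0\n"
          "RP PAS Routed to Null0 553005807 0 3661615")
print(top_reasons(sample, 2))
# [('Routed to Null0', 556667422), ('Packet destined for us', 25220386)]
```

The highest-ranked reason tells you which rate limiter, feature, or adjacency problem to chase first.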
High CPU Utilization
• NetDR is supported on Catalyst 6500 and Cisco 7600 platforms starting in 12.2(18)SXF
• Non-Intrusive Debug that can be used for troubleshooting high CPU
• Captures up to 4096 frames (wrap with continuous option)
Due to Interrupts - NetDR
88
Direction
From CPU’s Perspective
• Receive (Rx)
• Transmit (Tx)
• Both
Filters
• Interface
• Source/Destination Index
• Ingress VLAN
• Ethertype
• Source/Destination MAC
• Source/Destination IP Address
High CPU Utilization NetDR – Example IPv4
89
Sup2T# debug netdr capture rx
Sup2T# show netdr captured-packets
A total of 111 packets have been captured
The capture buffer wrapped 0 times
Total capture capacity: 4096 packets
------- dump of incoming inband packet -------
l2idb Gi6/3, l3idb Vl576, routine inband_process_rx_packet, timestamp 21:33:37.779
dbus info: src_vlan 0x240(576), src_indx 0x142(322), len 0x82(130)
bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x7FA3(32675)
cap1 0, cap2 0
D0020900 02400400 01420000 82000000 1E000424 26000004 00000000 7FA3FCBB
destmac B4.14.89.61.37.80, srcmac 08.D0.9F.E3.6D.C2, shim ethertype CCF0
earl 8 shim header IS present:
version 0, control 64(0x40), lif 576(0x240), mark_enable 1,
feature_index 0, group_id 0(0x0), acos 0(0x0), ttl 14,
dti 4, dti_value 0(0x0)
ethertype 0800
protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 100, identifier 5
df 0, mf 0, fo 0, ttl 255, src 14.2.36.1, dst 14.2.36.11
icmp type 8, code 0
“rx” captures traffic received by the CPU
Maximum capacity is 4096 packets. Currently, 111 packets are captured.
Source vlan of the incoming traffic.
Source and Destination IP address
Source and Destination MAC address
Ingress interface
Ethertype 0800 = IPv4 packet
High CPU Utilization NetDR – Example ARP
90
------- dump of incoming inband packet -------
l2idb Gi6/3, l3idb Vl576, routine inband_process_rx_packet, timestamp 21:21:46.407
dbus info: src_vlan 0x240(576), src_indx 0x142(322), len 0x40(64)
bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x7FF3(32755)
cap1 1, cap2 0
F0020100 02400400 01420000 40000000 E0000052 86000004 00000000 7FF36ACC
destmac B4.14.89.61.37.80, srcmac D8.67.D9.0B.BF.3E, ethertype 0806
layer 3 data: 00010800 06040002 D867D90B BF3E0E02 243BB414 89613780
0E02240B 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
Ingress interface Source VLAN
Source and Destination MAC addresses
Ethertype 0806 = ARP
OpCode 2 = ARP Reply
Sender’s MAC Sender’s IP Destination
MAC Destination IP
How am I supposed to find the top talker(s) among 4096 packets? Is there a simpler way than doing it manually?
Yes. Visit http://netdr.54.198.170.81.xip.io/
This tool can be accessed from the Cisco Support Tools page: http://www.cisco.com/c/en/us/support/web/tsd-most-requested-tools.html
High CPU Utilization Packet Capture Tools supported in Catalyst 6500
91
Tool Support Capability / Comments
SPAN / RSPAN / ERSPAN | Sup720 and Sup2T | Captures traffic received on any port and replicates it to a destination port, VLAN, or IP address (after encapsulation)
Mini-Protocol Analyzer (MPA) | Sup720 and Sup2T | Captures traffic from, to, and through the switch. Only one session allowed at any given time.
In-Band SPAN | Not supported on Sup2T | Captures traffic going to the CPU via the inband channel
Flexible Netflow (FnF) | Not supported on Sup720 | Captures traffic from/to the CPU / control plane
ELAM | Sup720 and Sup2T | Captures ONLY ONE packet from/to the CPU or through the switch. Provides information on the forwarding decision.
NetDR | Sup720 and Sup2T | Captures process-switched and interrupt-switched (software CEF) traffic in the inband channel.
Relevant Session at CiscoLive 2014:
BRKARC-2011: Overview of Packet Capturing Tools in Cisco Switches and Routers
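As a reminder of how the first tool in the table is set up, a minimal local SPAN session might look like this (interface numbers are placeholders):

```
! Replicate traffic received on Gi1/1 to an analyzer on Gi2/1
Sup2T(config)# monitor session 1 source interface GigabitEthernet1/1 rx
Sup2T(config)# monitor session 1 destination interface GigabitEthernet2/1
Sup2T(config)# end
Sup2T# show monitor session 1
```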
Verify if CEF is enabled on all interfaces
High CPU Utilization
Sup2T# show cef state
CEF Status:
RP instance
common CEF enabled
IPv4 CEF Status:
CEF enabled/running
dCEF enabled/running
CEF switching enabled/running
universal per-destination load sharing algorithm, id C3097102
IPv6 CEF Status:
CEF disabled/not running
dCEF disabled/not running
universal per-destination load sharing algorithm, id C3097102
RRP state:
I am standby RRP: no
RF Peer Presence: yes
RF PeerComm reached: yes
<snip>
Due to Interrupts – CEF Status
92
Verify if CEF / dCEF is enabled and running globally
Sup2T# show ip interfaces | inc line pro|CEF
<snip>
Vlan10 is up, line protocol is up
IP CEF switching is enabled
IP CEF switching turbo vector
IP route-cache flags are Fast, CEF
Vlan20 is up, line protocol is up
IP CEF switching is enabled
IP CEF switching turbo vector
IP route-cache flags are Fast, CEF
Vlan30 is up, line protocol is up
IP CEF switching is enabled
IP CEF switching turbo vector
IP route-cache flags are Fast, CEF
<snip>
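If either check shows CEF disabled, re-enabling it is normally a one-liner. A hedged sketch (the IPv6 line only applies where IPv6 routing is wanted):

```
! Global CEF / dCEF
Sup2T(config)# ip cef distributed
! IPv6 CEF (only meaningful with ipv6 unicast-routing)
Sup2T(config)# ipv6 cef
! Restore CEF switching on an interface where it was disabled
Sup2T(config)# interface Vlan10
Sup2T(config-if)# ip route-cache cef
```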
PFC mode in a Catalyst 6500 switch with a Sup2T engine. The PFC mode determines the maximum hardware resources supported.
High CPU Utilization
Sup2T# show platform hardware pfc mode
PFC operating mode : PFC4XL
Sup2T# show platform hardware cef maximum-routes
Fib-size: 1024k (1048576), shared-size: 1016k (1040384),
shared-usage: 0k(0)
Protocol Max-routes Use-shared-region
-------- ---------- ----------------- ---------
IPV4 1017k Yes 1k
IPV4-MCAST 1017k Yes 1k
IPV6 1017k Yes 1k
IPV6-MCAST 1017k Yes 1k
MPLS 1017k Yes 1k
EoMPLS 1017k Yes 1k
VPLS-IPV4-MCAST 1017k Yes 1k
VPLS-IPV6-MCAST 1017k Yes 1k
Sup2T# show platform hardware cef exception status
Current IPv4 FIB exception state = FALSE
Current IPv6 FIB exception state = FALSE
Current MPLS FIB exception state = FALSE
Current EoM/VPLS FIB TCAM exception state = FALSE
Due to Interrupts – FIB Status
93
Sup2T# sh plat hardware cef summary
Total routes: 77
IPv4 unicast routes: 66
IPv4 non-vrf routes: 50
IPv4 vrf routes: 16
IPv4 multicast routes: 6
IPv6 unicast routes: 2
IPv6 global routes: 2
IPv6 non-vrf routes: 2
IPv6 vrf routes: 0
IPv6 link-local routes: 0
IPv6 multicast routes: 1
mpls routes: 1
mpls-vpn routes: 0
eompls-l2 routes: 1
eom-ipv4-mcast routes: 0
eom-ipv6-mcast routes: 0
Number of routes installed in the hardware currently. Sup2T supports fine-tuning the FIB entries allocated to
each protocol by using command:
Sup2T(config)# plat hard cef max <protocol> <value>
Check CEF exception status
Maximum number of routes supported for each protocol (IPv4, IPv6, etc.), whether the shared region is used, and the dedicated allocation (the 1k column).
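Building on the tuning command above, a sketch of reallocating FIB TCAM (protocol keywords and permitted values depend on PFC mode and release; 512 is an arbitrary example, in units of 1K entries, and a reload is typically required for the change to take effect):

```
! Example only: grow the IPv4 allocation
Sup2T(config)# platform hardware cef maximum-routes ip 512
Sup2T(config)# end
Sup2T# show platform hardware cef maximum-routes
```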
High CPU Utilization
Sup2T# show platform hardware capacity forwarding
L2 Forwarding Resources
MAC Table usage: Module Collisions Total Used %Used
1 0 131072 55 1%
2 0 131072 55 1%
5 0 131072 55 1%
6 0 131072 53 1%
L3 Forwarding Resources
FIB TCAM usage: Total Used %Used
72 bits (IPv4, MPLS, EoM) 1048576 68 1%
144 bits (IP mcast, IPv6) 524288 8 1%
288 bits (IPv6 mcast) 262144 1 1%
detail: Protocol Used %Used
IPv4 66 1%
MPLS 1 1%
EoM 1 1%
IPv6 2 1%
IPv4 mcast 6 1%
IPv6 mcast 1 1%
Adjacency usage: Total Used %Used
1048576 32029 3%
( continued … )
Monitoring resources usage
94
High CPU Utilization
( Continued ... )
Forwarding engine load:
Module pps peak-pps peak-time
1 2 1374 11:40:31 EST Fri Mar 23 2012
2 0 408 17:34:47 EST Thu Mar 22 2012
5 52 666 11:40:31 EST Fri Mar 23 2012
6 0 25 12:11:11 EST Fri Mar 23 2012
Sup2T# show platform hardware capacity ?
acl Show QoS/Security ACL capacity
cpu Show CPU resources capacity
fabric Show Fabric resources capacity
forwarding Show forwarding engine capacity
monitor Show SPAN resources capacity
multicast Show L3 and LTL Multicast resources
netflow Show Netflow capacity
pfc Show PFC resources capacity
power Show Power resources capacity
qos Show QoS resources capacity
rate-limit Show CPU Rate Limiters capacity
rewrite-engine Show rewrite-engine capacity
vlan Show VLAN resources capacity
Monitoring resources usage
95
Note: Not all available options are shown here
Usage of ACL TCAM
Usage of switching fabric
Usage of Netflow TCAM
Usage of QoS TCAM
Usage of hardware rate-limiters
[Diagram: Supervisor 2T / PFC4 architecture. MSFC5 (CPU, Flash, DRAM) with 1GE/10GE uplinks, connected via DBUS/RBUS/EOBC (traces #1 to 24) and a 2 Gbps link to the central Management Processor. PFC4 contains the L2 Engine (128K L2 CAM) and the L3/4 Engine with FIB TCAM, ADJ TCAM, NetFlow, CL1/CL2 ACL TCAMs, ACE counters, LIF table/stats/map, RPF table, and MET. A 40 Gbps Fabric Interface & Replication Engine and Port ASIC connect to the Switch Fabric over 26 x 40G traces.]
High CPU Utilization Commands to set baseline resource usage
96
show ip traffic
show interfaces
show proc cpu
show ibc
sh plat hard capacity fabric
1. sh plat hard capacity forward
2. sh plat hard capacity acl
3. sh plat hard capacity qos
4. sh plat hard capacity netflow
5. sh plat hard capacity rate-limit
1. sh plat hard capacity monitor
2. sh plat hard capacity rewrite
sh plat hard cap multicast
Troubleshooting High CPU Utilization
• High CPU Utilization – Due to interrupts – troubleshooting commands
– Monitoring hardware resource usage
• Control Plane Protection and Monitoring – Hardware Rate-Limiters (HWRL) and Control Plane Policing (CoPP)
– Flexible Netflow (FnF)
Agenda
97
Control Plane Protection Hardware Rate-Limiters
98
[Diagram: A hardware rate-limiter of 1000 pps is enforced on the PFC (Supervisor) and on the DFC of each linecard. Offered loads of 15,000 pps, 20,000 pps, and 8,000 pps are each limited to 1000 pps, so 3,000 pps reach the CPU through the switch fabric.]
Hardware rate-limiter policies are enforced independently within each Forwarding Engine, and the aggregate traffic from all of the Forwarding Engines reaches the CPU: Aggregate Traffic = N x Policy pps, where N = number of Forwarding Engines. In this example the HWRL policy is 1000 pps, so with three Forwarding Engines up to 3,000 pps can reach the CPU.
Control Plane Protection Hardware Rate-Limiters
99
Sup2T# show platform rate-limit
State : ON - enabled but not sharing, ON/S - enabled and sharing
Share : NS - not sharing, G - group, S - static sharing, D - dynamic sharing
: P/sec - Packets/sec, B/sec - Bytes/second, BP - Burst period (microsec)
Rate Limiter Type State P/sec P/burst B/sec B/burst BP Share Leak
--------------------- ----- -------- -------- ---------- ---------- ------- ------- ----
CEF RECEIVE OFF - - - - - - -
CEF GLEAN ON 1000 - - - 1000000 NS OFF
IP ERRORS OFF - - - - - - -
UCAST IP OPTION ON 1000 - - - 100 G: 0, S ON
ICMP ACL-DROP ON 1000 - - - 100 G: 0, S ON
ICMP NO-ROUTE ON 100 - - - 1000000 NS OFF
ICMP REDIRECT OFF - - - - - - -
TTL FAILURE OFF - - - - - - -
Traffic with TTL=1 is not rate-limited (the TTL FAILURE limiter is OFF)
Traffic hitting an ACL deny entry is rate-limited
Rate-limiters are implemented in hardware to reduce the flow of excess traffic to the CPU
Sup2T(config)#platform rate-limit ?
all Rate Limiting for both Unicast and Multicast packets
layer2 layer2 protocol cases
multicast Rate limiting for Multicast packets
unicast Rate limiting for Unicast packets
En/Disable and fine-tune hardware rate-limiters.
Use “mls rate-limit” for Sup720 engines.
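A sketch of fine-tuning one limiter from the output above (values are examples; the keyword tree differs between Sup2T's `platform rate-limit` and Sup720's `mls rate-limit` and can vary by release):

```
! Sup2T: police glean (ARP-triggering) traffic to 1000 pps, burst 100
Sup2T(config)# platform rate-limit unicast cef glean 1000 100
! Sup720 equivalent
Sup720(config)# mls rate-limit unicast cef glean 1000 100
! Verify
Sup2T# show platform rate-limit
```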
Control Plane Protection
Unicast Rate Limiters
CEF Receive Traffic Destined to the Router
CEF Glean ARP Packets
CEF No Route Packets with No Route in the FIB
ICMP Redirect Packets that Require ICMP Redirects
IP Errors Packet with IP Checksum or Length Errors
IP Features Security Features (Auth-Proxy, IPsec, others)
ICMP No Route ICMP Unreachables for Unroutable Packets
ICMP ACL Drop ICMP Unreachables for Admin Deny Packets
RPF Failure Packets that Fail uRPF Check
L3 Security CBAC, Auth-Proxy, and IPSEC Traffic
ACL Input NAT, TCP Int, Reflexive ACLs, Log on ACLs
ACL Output NAT, TCP Int, Reflexive ACLs, Log on ACLs
VACL Logging CLI Notification of VACL Denied Packets
IP Options Unicast Traffic with IP Options Set
Capture Used with Optimized ACL Logging
Hardware Rate Limiters Support
100
Multicast Rate Limiters
Multicast FIB-Miss Packets with No mroute in the FIB
Partial Shortcut Partial Shortcut Entries
Directly Connected Local Multicast on Connected Interface
IP Options Multicast Traffic with IP Options Set
V6 Directly Connect Packets with No Mroute in the FIB
V6*, G M Bridge IGMP Packets
V6*, G Bridge Partial Shortcut Entries
V6 S, G Bridge Partial Shortcut Entries
V6 Route Control Partial Shortcut Entries
V6 Default Route Multicast Traffic with IP Options Set
V6 Second Drop Multicast Traffic with IP Options Set
Layer 2 Rate Limiters
L2PT L2PT Encapsulation / Decapsulation
PDU Layer 2 PDUs
IGMP IGMP Packets
General Rate Limiters
MTU Failure Packets Requiring Fragmentation
TTL Failure Packets with TTL<=1
Control Plane Protection Control Plane Policing
101
[Diagram: Hardware CoPP of 10 kbps is enforced on the PFC (Supervisor) and on the DFC of each linecard. Offered loads of 100 kbps, 250 kbps, and 60 kbps are each policed to 10 kbps, so (10 kbps * 3) = 30 kbps reach the software CoPP, which polices the aggregate down to 10 kbps in front of the CPU.]
Control-plane interface policies are configured and enforced on each Forwarding Engine, and also at the software interface as one final aggregate policing policy. The aggregate from all of the Forwarding Engines reaches the CPU, where a final software-based policer is applied at the logical interface attached to the CPU.
Control Plane Protection Control Plane Policing Deployment
102
class-map match-all copp-bgp
match access-group name coppacl-bgp
class-map match-all copp-igp
match access-group name coppacl-igp
class-map match-all copp-management
match access-group name coppacl-management
class-map match-all copp-reporting
match access-group name coppacl-reporting
class-map match-all copp-monitoring
match access-group name coppacl-monitoring
class-map match-all copp-critical-app
match access-group name coppacl-critical-app
class-map match-all copp-undesirable
match access-group name coppacl-undesirable
control-plane
service-policy input copp-policy
Step 2: Associate each
class of traffic to a class-map
policy-map copp-policy
class copp-bgp
police 30000000 conform-action transmit exceed-action drop
class copp-igp
police 30000000 conform-action transmit exceed-action drop
class copp-management
police 30000000 conform-action transmit exceed-action drop
class copp-reporting
police 30000000 conform-action transmit exceed-action drop
class copp-monitoring
police 30000000 conform-action transmit exceed-action drop
class copp-critical-app
police 30000000 conform-action transmit exceed-action drop
class copp-undesirable
police 30000000 conform-action transmit exceed-action drop
class class-default
police 30000000 conform-action transmit exceed-action drop
Step 3: Apply a policing action for each class. Switch will
ignore a class that does not have a corresponding action.
If both conform-action and exceed-action are set to transmit,
it will allocate a default policer as opposed to a dedicated
policer with its own hardware counters.
Step 1: Identify the interesting traffic
and classify with ACLs (with permits)
Step 4: Apply the policy-map
under control-plane interface
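Step 1 classifies with permit ACLs; one such ACL might be sketched as follows (the ACL name matches the class-maps above, but the BGP peer addresses are placeholders):

```
! Match BGP sessions to/from this switch (example addresses)
ip access-list extended coppacl-bgp
 permit tcp host 192.0.2.1 host 192.0.2.2 eq bgp
 permit tcp host 192.0.2.1 eq bgp host 192.0.2.2
```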
Control Plane Protection
• A mechanism to protect the CPU from oversubscription
• Allows more granular control and monitoring compared to the Hardware Rate-Limiters (HWRLs).
Things to remember:
• “mls qos” should be enabled for Hardware CoPP - This enables port-level QoS mechanisms.
• Hardware CoPP will ignore a class that does not have a corresponding policing action.
• Hardware CoPP decisions are per forwarding engine (FE). SW CoPP for the aggregate traffic
• Hardware CoPP does not support IP/ARP broadcast/multicast traffic
– Use multicast HWRL, Dynamic ARP Inspection (DAI), “mls qos protocol <options>” or multicast/broadcast Storm-Control feature (per-port based).
Control Plane Policing - Summary
103
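The first two reminders above can be sketched as CLI (exact commands and counter formats vary by release):

```
! Hardware CoPP requires global QoS to be enabled
Sup2T(config)# mls qos
! Per-class hardware (per-EARL) and software counters
Sup2T# show policy-map control-plane input
```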
Control Plane Protection Support for Distributed Aggregate Policers in Sup2T
104
[Diagram: With a distributed aggregate policer, hardware CoPP on the PFC and on each DFC is synchronized, so the combined conforming traffic from all Forwarding Engines is held to the configured 10 kbps before it reaches the software CoPP and the CPU.]
Control-plane policy can be implemented as a distributed policer. The Forwarding Engines are synchronized via Policer Update Packets to maintain the aggregate traffic rate for a given policy. With distributed policing, the aggregate from all the Forwarding Engines is policed to the configured rate or less.
Hardware Rate-Limiters
Traffic to CPU
Special Cases
Matches Policy
Hardware CoPP
CPU Software
CoPP
Control Plane Protection CoPP and Hardware Rate Limiters
105
Few Considerations:
When a packet matches both HW CoPP and HWRL, the packet undergoes HWRL policy and skips HW
CoPP. In essence, HWRL overrides HW CoPP.
Configure the CEF receive rate-limiter with caution: because it matches all traffic destined to the Route Processor (“good” frames and “bad” frames) and takes precedence over CoPP, it is usually best to rely on CoPP instead.
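Following the advice above, the CEF receive rate-limiter can be left (or turned back) off so that CoPP handles receive traffic; a sketch:

```
! Prefer CoPP over the CEF receive rate-limiter
Sup720(config)# no mls rate-limit unicast cef receive
! Sup2T equivalent (keyword tree may vary by release)
Sup2T(config)# no platform rate-limit unicast cef receive
```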
Flexible NetFlow Sup720 vs. Sup2T
106
[Diagram: Traditional NetFlow (v5 or v9) on Sup720 keys every flow on a fixed 7-tuple (source interface, source IP, destination interface, destination IP, protocol, source port, destination port), fills a single NetFlow cache plus an optional aggregation cache with expiration timers, and exports to one collector. Flexible NetFlow (v5 or v9) on Sup2T lets each flow monitor choose its own key fields and expiration timers, supports aggregation without a separate cache, and exports to multiple destinations; the sample caches show per-flow packet counts, destination-prefix aggregation with protocol and IP precedence, and source-prefix/mask aggregation with protocol and ToS.]
Flexible Netflow Configuration Steps
107
flow record SAMPLE-FLOW
match ipv4 source address
match ipv4 destination address
match transport source-port
match transport destination-port
match flow direction
collect counter bytes
collect counter packets
collect timestamp sys-uptime first
collect timestamp sys-uptime last
flow exporter SAMPLE-EXPORT-1
description SAMPLE FnF v9 Exporter
destination 11.1.1.1 vrf MGMT
source Loopback0
transport udp 999
flow exporter SAMPLE-EXPORT-2
description SAMPLE FnF v9 Exporter
destination 12.1.1.1 vrf MGMT
transport udp 999
flow monitor SAMPLE-MONITOR
description SAMPLE FnF v9 Monitor
record SAMPLE-FLOW
exporter SAMPLE-EXPORT-1
exporter SAMPLE-EXPORT-2
interface GigabitEthernet1/1/1
ip address 172.16.0.1 255.255.255.0
ip flow monitor SAMPLE-MONITOR input
ip flow monitor SAMPLE-MONITOR output
logging event link-status
interface Vlan10
ip address 172.16.1.1 255.255.255.0
ip flow monitor SAMPLE-MONITOR input
ip flow monitor SAMPLE-MONITOR output
logging event link-status
NON-KEY
KEY
Interfaces support multiple
monitors if their key fields do
not overlap
Steps:
1. Create Flow Record
2. Create Flow Exporter
3. Associate Record
and Exporter to a
Flow Monitor
4. Apply to the
interfaces
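Once applied, the monitor and exporters can be verified with commands along these lines (a sketch; output formats and sort options vary by release):

```
Sup2T# show flow monitor SAMPLE-MONITOR cache
Sup2T# show flow monitor SAMPLE-MONITOR statistics
Sup2T# show flow exporter SAMPLE-EXPORT-1 statistics
```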
Flexible Netflow Monitoring Control-Plane traffic
108
Sup2T# show process cpu
CPU utilization for five seconds: 65%/8%; one minute: 63%; five minutes: 61%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
310 30544 189234 81 47.12% 45.11% 45.23% 0 IP Input
Sup2T(config)#flow RECORD copp-fnf-cef-receive-rec
Sup2T(config-flow-record)#match ipv4 protocol
Sup2T(config-flow-record)#match ipv4 source address
Sup2T(config-flow-record)#match ipv4 destination address
Sup2T(config-flow-record)#match transport source-port
Sup2T(config-flow-record)#match transport destination-port
Sup2T(config-flow-record)#collect interface input
Sup2T(config-flow-record)#collect counter packets
Sup2T(config-flow-record)#exit
Sup2T(config)#flow MONITOR copp-fnf-cef-receive
Sup2T(config-flow-monitor)#record copp-fnf-cef-receive-rec
Sup2T(config-flow-monitor)#exit
Sup2T(config)#control-plane
Sup2T(config-cp)#ip flow monitor copp-fnf-cef-receive input
Sup2T(config-cp)#exit
High CPU due to process “IP Input”
Building a FnF record, matching L3 and L4
parameters (key fields) and collecting
details on Input interface and packet count
(non-key fields)
Associating the FnF record to a
monitor. Here, there is an option
(not enabled here) to export the
data to the collector
Applying to the control-
plane interface
Flexible Netflow
Sup2T# show flow monitor copp-fnf-cef-receive cache sort counter packet
Processed 5 flows
Aggregated to 5 flows
Showing the top 5 flows
IPV4 SOURCE ADDRESS: 192.168.40.5
IPV4 DESTINATION ADDRESS: 192.168.40.1
TRNS SOURCE PORT: 48827
TRNS DESTINATION PORT: 2413
IP PROTOCOL: 17
interface input: Vl40
counter packets: 460983
<snip>
After several seconds…
Sup2T# sh flow mon copp-fnf-cef-receive ca sort count pack
<snip>
IPV4 SOURCE ADDRESS: 192.168.40.5
IPV4 DESTINATION ADDRESS: 192.168.40.1
TRNS SOURCE PORT: 48827
TRNS DESTINATION PORT: 2413
IP PROTOCOL: 17
interface input: Vl40
counter packets: 461181
<snip>
Monitoring Control-Plane traffic
109
First flow with high number
of packets hitting the CPU
Results sorted according to
the number of packets per flow
Clear the statistics of the FnF using the command:
Sup2T# clear flow monitor ?
copp-fnf-cef-receive User defined
name Name a specific Flow Monitor
<cr>
Sup2T# clear flow monitor copp-fnf-cef-receive ?
cache Flow Monitor cache information
force-export Export the contents of the cache
statistics Flow Monitor cache statistics
<cr>
Flexible Netflow
Sup2T(config)#ip access-list extended UDP2413
Sup2T(config-ext-nacl)#permit udp host 192.168.40.5 host 192.168.40.1 eq 2413
Sup2T(config)#class-map TEST
Sup2T(config-cmap)#match access-group name UDP2413
Sup2T(config)#policy-map policy-default-autocopp
Sup2T(config-pmap)#class TEST
Sup2T(config-pmap-c)#police rate 50 pps burst 10 packets
Rate-limiting the traffic causing high CPU
110
The default CoPP applied to
the control-plane interface
Sup2T# show process cpu
CPU utilization for five seconds: 10%/8%;
<snip>
Sup2T# show policy-map control-plane input class TEST
Control Plane Interface
Service-policy input: policy-default-autocopp
Hardware Counters:
class-map: TEST (match-all)
<snip>
Earl in Slot 1: <snip>
Earl in Slot 2: <snip>
Software Counters:
<snip>
Hardware (per EARL) aggregate
counters and Software counters
CPU usage went down
after applying the policer
Once the flow is
identified, further action
could be (1) blocking
the flow with an Access
List (ACL) or (2) rate-
limiting it using Control
Plane Policing (CoPP)
depending on the
criticality of the flow.
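The first option mentioned in the note, blocking the flow outright, might be sketched like this (VLAN and addresses follow the example flow; treat the ACL name as hypothetical):

```
! Drop the offending UDP flow at its ingress SVI
ip access-list extended BLOCK-UDP2413
 deny udp host 192.168.40.5 host 192.168.40.1 eq 2413
 permit ip any any
interface Vlan40
 ip access-group BLOCK-UDP2413 in
```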
Protecting Control Plane
• Protecting Cisco Catalyst 6500 Series Switches Using Control Plane Policing, Hardware Rate Limiting, and Access-Control Lists http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/white_paper_c11_553261.html
• Protecting the Cisco Catalyst 6500 Series Switches Against Denial-Of-Service Attacks http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/prod_white_paper0900aecd802ca5d6.html
• Troubleshooting tools to analyze high CPU utilization issues on Catalyst 6500 Series switches https://supportforums.cisco.com/docs/DOC-22037
• Control Plane Policing Implementation Best Practices (general and platform specific) http://www.cisco.com/web/about/security/intelligence/coppwp_gs.html
• Borderless Networks Security: Catalyst 6500 Control plane Protection Techniques for Maximum Uptime http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/white_paper_c11-663623.html
• Cisco Catalyst 6500 Supervisor Engine 2T: NetFlow Enhancements http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/white_paper_c11-652021.html
References
111
High CPU Utilization Summary
112
Some of the causes for high CPU
utilization due to interrupts
Commands to troubleshoot, and using
NetDR
Commands used to set baseline
hardware resource usage
Using Hardware Rate-Limiter, Control-
Plane Policing (CoPP) and Flexible
Netflow (FnF) to protect and monitor
control-plane
It is very important to set baseline CPU
usage and traffic level in the inband
channel (if possible, per protocol/flow),
under normal working conditions, and
find deviation when CPU usage spikes.
It is critical to protect the CPU for a stable network.
Closely monitor and justify usage of all
the hardware resources.
Take Away Points
Troubleshooting Catalyst 6500/6800 Switches Final Message
113
Please practice and get familiar with the troubleshooting techniques.
If you don’t use it, you lose it.
Catalyst 6500/6800 is thriving ….. and …..
Innovation Continues !!!
We can troubleshoot it !!
Recommended Sessions @ CiscoLive 2014
114
• BRKCRS-3148 – Advanced Catalyst 6500 / 6800 Series Troubleshooting
• BRKCRS-3035 – Advanced Enterprise Campus Design: Virtual Switching System (VSS)
• BRKCRS-3036 – Advanced Enterprise Campus Design: Routed Access
• BRKCRS-3502 - Advanced Enterprise Campus Design: Instant Access
• BRKCRS-2501 – Campus QoS Design-Simplified
• BRKARC-3465 – Cisco Catalyst 6800 Switch Architecture
• BRKDCT-2333 – Data Center Network Failure Detection
• TECCRS-2932 – Campus LAN Switching Architecture
• LTRCRS-2004 – Cisco Catalyst Instant Access - Virtual Switching System (IA-VSS) Lab
• TECCRS-2001 – Enterprise High Availability Design and Architecture
Join Cisco Support Communities!
• Free for anyone with Cisco.com registration
• Get timely answers to your technical questions
• Find relevant technical documentation
• Engage with over 200,000 top technical experts
• Seamless transition from discussion to TAC Service Request (Cisco customers and partners only)
supportforums.cisco.com
supportforums.cisco.mobi
The Cisco Support Community is your one-stop community destination from Cisco for sharing current, real-world technical support knowledge with peers and experts.
Documents
Discussions
Blogs
Video Ask the Expert
Mobile
Sample Cat6500 docs:
Troubleshooting High CPU in Catalyst 6500:
https://supportforums.cisco.com/docs/DOC-15602
ACL TCAMs and LoUs in Catalyst 6500:
https://supportforums.cisco.com/docs/DOC-16384
Troubleshooting with NETDR in Catalyst 6500 with Sup720:
https://supportforums.cisco.com/docs/DOC-15608
Troubleshooting tools to analyze high CPU utilization on Cat6500:
https://supportforums.cisco.com/docs/DOC-22037
115
Participate in the “My Favorite Speaker” Contest
• Promote your favorite speaker through Twitter and you could win $200 of Cisco Press products (@CiscoPress)
• Send a tweet and include
– Your favorite speaker’s Twitter handle @YogiCisco
– Two hashtags: #CLUS #MyFavoriteSpeaker
• You can submit an entry for more than one of your “favorite” speakers
• Don’t forget to follow @CiscoLive and @CiscoPress
• View the official rules at http://bit.ly/CLUSwin
Promote Your Favorite Speaker and You Could be a Winner
116
Complete Your Online Session Evaluation
• Give us your feedback and you could win fabulous prizes. Winners announced daily.
• Complete your session evaluation through the Cisco Live mobile app or visit one of the interactive kiosks located throughout the convention center.
Don’t forget: Cisco Live sessions will be available for viewing on-demand after the event at CiscoLive.com/Online
117
Continue Your Education
• Demos in the Cisco Campus
• Walk-in Self-Paced Labs
• Table Topics
• Meet the Engineer 1:1 meetings
118