ACI Troubleshooting Book

Date post: 14-Apr-2016
  • ACI Troubleshooting Book Documentation

    Release 1.0.1

    Andres Vega, Bryan Deaver, Jerry Ye, Kannan Ponnuswamy, Loy Evans, Mike Timm, Paul Lesiak, Paul Raytick

    July 09, 2015

    Contents

    1 Preface
        1.1 Authors and Contributors
        1.2 Authors
        1.3 Distinguished Contributors
        1.4 Dedications
        1.5 Acknowledgments
        1.6 Book Writing Methodology
        1.7 Who Should Read This Book?
            Expected Audience
        1.8 Organization of this Book
            Section 1: Introduction to ACI Troubleshooting
            Section 2: Sample Reference Topology
            Section 3: Troubleshooting Application Centric Infrastructure

    2 Application Centric Infrastructure

    3 ACI Policy Model
        3.1 Abstraction Model
        3.2 Everything is an Object
        3.3 Relevant Objects and Relationships
        3.4 Hierarchical ACI Object Model and the Infrastructure
        3.5 Infrastructure as Objects
        3.6 Build object, use object to build policy, reuse policy
        3.7 REST API just exposes the object model
        3.8 Logical model, Resolved model, concrete model
        3.9 Formed and Unformed Relationships
        3.10 Declarative End State and Promise Theory

    4 Troubleshooting Tools
        4.1 APIC Access Methods
            GUI
            API
            CLI
        4.2 Programmatic Configuration (Python)
        4.3 Fabric Node Access Methods
        4.4 Exporting information from the Fabric
        4.5 External Data Collection - Syslog, SNMP, Call-Home
        4.6 Health Scores
        4.7 Atomic Counters

    5 Troubleshooting Methodology
        5.1 Overall Methodology

    6 Sample Reference Topology
        6.1 Physical Fabric Topology
        6.2 Logical Application Topology

    7 Troubleshooting
        7.1 Naming Conventions
            Overview
            Suggested Naming Templates

    8 Initial Hardware Bringup
        8.1 Overview
        8.2 Problem Description
            Symptom 1
            Verification/Resolution
            Symptom 2
            Verification/Resolution
        8.3 Problem Description
            Symptom
            Verification
            Resolution

    9 Fabric Initialization
        9.1 Overview
        9.2 Fabric Verification
        9.3 Problem Description
            Symptom 1
            Verification
            Symptom 2
            Verification
            Resolution
            Symptom 3
            Resolution
            Symptom 4
            Verification
            Resolution

    10 APIC High Availability and Clustering
        10.1 Overview
        10.2 Cluster Formation
        10.3 Majority and Minority - Handling Clustering Split Brains
        10.4 Problem Description
            Symptom
            Verification
        10.5 Problem Description
            Symptom
            Verification
        10.6 Problem Description
            Symptom
            Resolution
        10.7 Problem Description
            Symptom
            Resolution
        10.8 Problem Description
            Symptom
            Resolution
        10.9 Problem Description
            Symptom
            Resolution
        10.10 Problem Description
            Symptom
            Resolution

    11 Firmware and Image Management
        11.1 Overview
            APIC Controller and Switch Software
            Firmware Management
            Compatibility Check
            Firmware Upgrade Verification
            Verifying the Firmware Version and the Upgrade Status by use of the REST API
        11.2 Problem Description
            Symptom
            Verification
            Resolution
        11.3 Problem Description
            Symptom
            Verification
            Resolution
        11.4 Problem Description
            Symptom
            Verification
            Resolution

    12 Faults / Health Scores
        12.1 Overview
        12.2 Problem Description
            Symptom 1:
            Resolution 1:
        12.3 Problem Description
            Symptom 1:
            Resolution 1:
            Resolution 2:

    13 REST Interface
        13.1 Overview
            ACI Object Model
            Queries
            APIC REST API
            Payload Encapsulation
            Read Operations
            Write Operations
            Authentication
            Filters
            Browser
            API Inspector
            ACI Software Development Kit (SDK)
            Establishing a Session
            Working with Objects
            APIC REST to Python Adapter
            Conclusion
        13.2 Problem Description
            Symptom 1
            Verification 1
            Resolution
            Symptom 2
            Verification
            Resolution
            Symptom 3
            Verification
            Resolution
            Symptom 4
            Verification
            Resolution
            Symptom 5
            Verification
            Resolution
            Symptom 6
            Verification
            Resolution

    14 Management Tenant
        14.1 Overview
            Fabric Management Routing
            Out-Of-Band Management
            Inband Management
            Layer 2 Inband Management
            Layer 2 Configuration Notes
            Layer 3
            Layer 3 Inband Configuration Notes
            APIC Management Routing
            Fabric Node (Switch) Routing
            Management Failover
            Management EPG Configuration
            Fabric Verification
        14.2 Problem Description
            Symptom
            Verification
            Resolution
        14.3 Problem Description
            Symptom
            Verification
            Resolution

    15 Common Network Services
        15.1 Overview
            Fabric Verification
            APIC
            Fabric nodes
        15.2 Problem Description
            Symptom 1
            Verification
            Resolution
            Symptom 2
            Verification
            Resolution
            Symptom 3
            Verification
            Symptom 4
            Verification
            Resolution
        15.3 Problem Description
            Symptom 1
            Verification
            Resolution
            Symptom 2
            Verification
            Resolution

    16 Unicast Data Plane Forwarding and Reachability
        16.1 Overview
            Verification - Endpoints
            Verification - VLANs
            Verification - Forwarding Tables
        16.2 Problem Description
            Symptom
            Verification
        16.3 Problem Description
            Symptom 1
            Verification

    17 Policies and Contracts
        17.1 Overview
            Verification of Zoning Policies
        17.2 Problem Description
            Symptom 1
            Verification
            Resolution
            Symptom 2
            Verification
            Resolution
            Symptom 3
            Verification
            Resolution
            Symptom 4
            Verification
            Resolution

    18 Bridged Connectivity to External Networks
        18.1 Overview
        18.2 Problem Description
            Symptom 1
            Verification
            Resolution
            Symptom 2
            Verification
            Resolution
        18.3 Problem Description
            Symptom 3
            Verification/Resolution
            Symptom 4
            Verification/Resolution
        18.4 Problem Description
            Symptom 1
            Verification
            Resolution
            Symptom 2
            Verification
            Resolution

    19 Routed Connectivity to External Networks
        19.1 Overview
            External Route Distribution Inside Fabric
        19.2 Fabric Verification
            Output from Spine 1
            Output from Spine 2
        19.3 Problem Description
            Symptom
            Verification/Resolution
        19.4 Problem Description
            Symptom 1
            Verification 1
            Resolution
            Verification 2
            Resolution 1
            Resolution 2
            Symptom 2
            Verification
            Resolution
        19.5 Problem Description
            Symptom
            Verification
            Resolution

    20 Virtual Machine Manager and UCS
        20.1 Overview
        20.2 Problem Description
            Symptom 1
            Verification
            Symptom 2
            Verification
        20.3 Problem Description
            Symptom
            Verification
            Symptom 2
            Verification
        20.4 Problem Description
            Symptom
            Verification
        20.5 Problem Description
            Symptom
            Verification

    21 L4-L7 Service Insertion 19721.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197

    Device Package . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197Service Graph Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198Concrete Device and Logical Device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200Device Cluster Selector Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201Rendering the Service Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

    21.2 Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202Symptom 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202Symptom 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203Symptom 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203Symptom 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203Symptom 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204Symptom 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204Symptom 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204Symptom 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

    22 ACI Fabric Node and Process Crash Troubleshooting 20522.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

    DME Processes: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205CLI: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205Identify When a Process Crashes: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206Collecting the Core Files: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206

    22.2 Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207Symptom 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207Symptom 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

    23 APIC Process Crash Troubleshooting 21123.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

    DME Processes: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211How to Identify When a Process Crashes: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213Collecting the Core Files: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214

    23.2 Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215

  • Symptom 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215Symptom 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

    24 Appendix 21824.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

    A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220E . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220F . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221G . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221J . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221L . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221M . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222P . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222S . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223V . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . 223X . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223

    25 Indices and tables 223

    Table of Contents:

    1 Preface

        Authors and Contributors
        Authors
        Distinguished Contributors
        Dedications
        Acknowledgments
        Book Writing Methodology
        Who Should Read This Book?
            Expected Audience
        Organization of this Book
            Section 1: Introduction to ACI Troubleshooting
            Section 2: Sample Reference Topology
            Section 3: Troubleshooting Application Centric Infrastructure

    1.1 Authors and Contributors

    This book represents a joint, intense, collaborative effort between Cisco's Engineering, Technical Support, Advanced Services and Sales employees over a single week in the same room at Cisco Headquarters Building 20 in San Jose, CA.

    1.2 Authors

    Andres Vega - Cisco Technical Services
    Bryan Deaver - Cisco Technical Services
    Jerry Ye - Cisco Advanced Services
    Kannan Ponnuswamy - Cisco Advanced Services
    Loy Evans - Systems Engineering
    Mike Timm - Cisco Technical Services
    Paul Lesiak - Cisco Advanced Services
    Paul Raytick - Cisco Technical Services

    1.3 Distinguished Contributors

    Giuseppe Andreello
    Pavan Bassetty
    Sachin Jain
    Sri Goli

    Contributors

    Lucien Avramov
    Piyush Agarwal
    Pooja Aniker
    Ryan Bos
    Mike Brown
    Robert Burns
    Mai Cutler
    Tomas de Leon
    Luis Flores
    Maurizio Portolani
    Michael Frase
    Mioljub Jovanovic
    Ozden Karakok
    Jose Martinez
    Rafael Mueller
    Chandra Nagarajan
    Mike Petrinovic
    Daniel Pita
    Mike Ripley
    Zach Seils
    Ramses Smeyers
    Steve Winters

    1.4 Dedications

    For Olga and Victoria, my love and happiness, hoping for a world that continues to strive in providing more effective solutions to all problems intrinsic to human nature. - Andres Vega

    An appreciative thank you to my wife Melanie and our children Sierra and Jackson for their support. And also to those that I have had the opportunity to work with over the years on this journey with Cisco. - Bryan Deaver

    For my parents, uncles, aunts, cousins, and wonderful nephews and nieces in the US, Australia, Hong Kong and China. - Jerry Ye

    To my wife, Vanitha for her unwavering love, support and encouragement, my kids Kripa, Krish and Kriti for the sweet moments, my sister who provided me the education needed for this book, my brother for the great times, and my parents for their unconditional love. - Kannan Ponnuswamy

    Would like to thank my amazing family, Molly, Ethan and Abby, without whom, none of the things I do would matter. - Loy Evans

    Big thanks to my wife Morena, my daughters Elena and Mayra. Thank you to my in-laws Guadalupe and Armando who helped watch my beautiful growing girls while I spent the time away from home working on this project. - Michael Timm

    Dedicated to the patience, love and continual support of Amanda; my sprite and best friend - Paul Lesiak

    For Susan, Matthew, Hanna, Brian, and all my extended family, thanks for your support throughout the years. Thanks as well to Cisco for the opportunity, it continues to be a fun ride. - Paul Raytick

    1.5 Acknowledgments

    While this book was produced and written in a single week, the knowledge and experience leading to it are the result of the hard work and dedication of many individuals inside and outside Cisco.

    Special thanks to Cisco's INSBU Executive, Technical Marketing and Engineering teams who supported the realization of this book. We would like to thank you for your continuous innovation and the value you provide to the industry.

    We want to thank Cisco's Advanced Services and Technical Services leadership teams for the trust they placed in this initiative and the support provided since the inception of the idea.

    In particular we want to express gratitude to the following individuals for their influence and support both prior to and during the book sprint:

    Shrey Ajmera
    Subrata Banerjee
    Dave Broenen
    John Bunney
    Luca Cafiero
    Ravi Chamarthy
    Mike Cohen
    Kevin Corbin
    Ronak Desai
    Krishna Doddapaneni
    Mike Dvorkin
    Tom Edsall
    Ken Fee
    Vikki Fee
    Siva Gaggara
    Shilpa Grandhi
    Ram Gunuganti
    Ruben Hakopian
    Robert Hurst
    Donna Hutchinson
    Fabio Ingrao
    Saurabh Jain
    Praveen Jain
    Prem Jain
    Soni Jiandani
    Sarat Kamisetty
    Yousuf Khan
    Praveen Kumar
    Tighe Kuykendall
    Adrienne Liu
    Anand Louis
    Gianluca Mardente
    Wayne McAllister
    Rohit Mediratta
    Munish Mehta
    Sameer Merchant
    Joe Onisick
    Ignacio Orozco
    Venkatesh Pallipadi
    Ayas Pani
    Amit Patel
    Maurizio Portolani
    Pirabhu Raman
    Alice Saiki
    Christy Sanders
    Enrico Schiattarella
    Priyanka Shah
    Pankaj Shukla
    Michael Smith
    Edward Swenson
    Srinivas Tatikonda
    Santhosh Thodupunoori
    Sergey Timokhin
    Muni Tripathi
    Bobby Vandalore
    Sunil Verma
    Alok Wadhwa
    Jay Weinstein
    Yi Xue

    We would also like to thank the Office of the CTO and Chief Architect for their hospitality while working in their office space.

    We are truly grateful to our book sprint facilitators Laia Ros and Adam Hyde for carrying us throughout this collaborative knowledge production process, and to our illustrator Henrik van Leeuwen who took abstract ideas and was able to render them as clear visuals. Our first concern was how to bring together so many people from different sides of the business to complete a project that traditionally takes months. The book sprint team showed that this is possible, and it presents a new model for how we collaborate, extract knowledge and experience, and present it in a single source.

    1.6 Book Writing Methodology

    The Book Sprint (www.booksprints.net) methodology was used for writing this book. The Book Sprint methodology is an innovative style of cooperative and collaborative authorship. Book Sprints are strongly facilitated and leverage team-oriented inspiration and motivation to rapidly deliver large amounts of well-authored and reviewed content, and incorporate it into a complete narrative in a short amount of time. By leveraging the input of many experts, the complete book was written in only five days. The effort nevertheless involved hundreds of authoring man-hours and drew on thousands of hours of engineering experience, allowing for extremely high quality in a very short production period.

    1.7 Who Should Read This Book?

    Expected Audience

    The intended audience for this book is those with a general need to understand how to operate and/or troubleshoot an ACI fabric. While operations engineers may experience the largest benefit from this content, the materials included herein may be of use to a much wider audience, especially given modern industry trends towards continuous integration and development, along with the ever-growing need for agile, DevOps-oriented methodologies.

    There are many elements in this book that explore topics outside the typical job responsibilities of network administrators. For example, the programmatic manipulation of policy models can be viewed as a development-oriented task, yet it has specific relevance to networking configuration and function, taking a very different approach than traditional CLI-based interface configuration.

    1.8 Organization of this Book

    Section 1: Introduction to ACI Troubleshooting

    The introduction covers basic concepts, terms and models while introducing the tools that will be used in troubleshooting. Also covered are the troubleshooting, verification and resolution methodologies used in later sections that cover the actual problems being documented.

    Section 2: Sample Reference Topology

    This section sets the baseline sample topology used throughout all of the troubleshooting exercises that are documented later in the book. Logical diagrams are provided for the abstract policy elements (the endpoint group objects, the application profile objects, etc.) as well as the physical topology diagrams and any supporting documentation that is needed to understand the focal point of the exercises. In each problem description in Section 3, references will be made to the reference topology as necessary. Where further examination is required, the specific aspects of the topology being examined may be re-illustrated in the text of the troubleshooting scenario.

    Section 3: Troubleshooting Application Centric Infrastructure

    The Troubleshooting ACI section goes through specific problem descriptions as they relate to the fabric. For each iterative problem, there will be a problem description, a listing of the process, some verification steps, and possible resolutions.

    Chapter format: The chapters that follow in the Troubleshooting section document the various problems. Each chapter arranges the verification of causes and the possible resolutions in the following format.

    Overview: Provides an introduction to the problem in focus by highlighting the following information:

    Theory and concepts to be covered

    Information on what should be happening

    Verification steps of a working state

    Problem Description: The problem description will be a high-level observation of the starting point for the troubleshooting actions to be covered. Example: a fabric node shows as inactive on the APIC when checked with the APIC CLI command acidiag fnvread.

    Symptoms: Depending on the problem, various symptoms and their impacts may be observed. In this example, some of the symptoms and indications of issues around an inactive fabric node could be:

    loss of connectivity to the fabric

    low health score

    system faults

    inability to make changes through the APIC

    In some chapters, multiple symptoms could be observed for the same problem description that require different verification or resolution.

    Verification: The logical set of steps to identify what is being observed will be indicated along with the appropriate tools and output. Additionally, some information on likely causes and resolution will be included.

    2 Application Centric Infrastructure

    In the same way that humans build relationships to communicate and share their knowledge, computer networks are built to allow for nodes to exchange data at ever increasing speeds and rates. The drivers for these rapidly growing networks are the applications, the building blocks that consume and provide the data which are close to the heart of the business lifecycle. The organizations tasked with nurturing and maintaining these expanding networks, nodes and vast amounts of data, are critical to those that consume the resources they provide.

    IT organizations have managed the conduits of this data as network devices with each device being managed individually. In the efforts to support an application, a team or multiple teams of infrastructure specialists build and configure static infrastructure including the following:

    Physical infrastructure (switches, ports, cables, etc.)

    Logical topology (VLANs, L2 interfaces and protocols, L3 interfaces and protocols, etc.)

    Access control configuration (permit/deny ACLs for application integration and common services)

    Quality of Service configuration

    Services integration (firewall, load balancing, etc.)

    Connecting application workload engines (VMs, physical servers, logical application instances)

    Cisco seeks to innovate the way this infrastructure is governed by introducing new paradigms. Going from a network of individually managed devices to an automated policy-based model that allows an organization to define the policy, and the infrastructure to automate the implementation of the policy in the hardware elements, will change the way the world communicates.

    To this end, Cisco has introduced Application Centric Infrastructure, or ACI, as a holistic systems-based approach to infrastructure management.

    The design intent of ACI is to provide the following:

    Application-driven policy modeling

    Centralized policy management and visibility of infrastructure and application health

    Automated infrastructure configuration management

    Integrated physical and virtual infrastructure management

    Open interface to enable flexible software and ecosystem partner integration

    Seamless communications from any endpoint to any endpoint

    There are multiple possible implementation options for an ACI fabric:

    Leveraging a network-centric approach to policy deployment: in this case a full understanding of application interdependencies is not critical, and instead the current model of a network-oriented design is maintained. This can take one of two forms:

    L2 Fabric: Uses the ACI policy controller to automate provisioning of network infrastructure based on L2 connectivity between connected network devices and hosts.

    L3 Fabric: Uses the ACI policy model to automate provisioning of network infrastructure based on L3 connectivity between network devices and hosts.

    Application-centric fabric: takes full advantage of all of the ACI objects to build out a flexible and completely automated infrastructure including L2 and L3 reachability, physical machine and VM connectivity integration, service node integration and full object manipulation and management.

    Implementations of ACI that take full advantage of the intended design from an application-centric perspective allow for end-to-end network automation spanning physical and virtual network and network services integration.

    All of the manual configuration and integration work detailed above is thus automated based on policy, making the infrastructure team's efforts more efficient.

    Instead of manually configuring VLANs, ports and access lists for every device connected to the network, the policy is created and the infrastructure itself resolves and provisions the relevant configuration on demand, where needed, when needed. Conversely, when devices, applications or workloads detach from the fabric, the relevant configuration can be de-provisioned, allowing for optimal network hygiene.

    Cisco ACI follows a model-driven approach to configuration management. This model-based configuration is disseminated to the managed nodes using the concept of Promise Theory.

    Promise Theory is a management model in which a central intelligence system declares a desired configuration end state, and the underlying objects act as autonomous intelligent agents that can understand the declarative end state and either implement the required change, or send back information on why it could not be implemented.

    In ACI, the intelligent agents are purpose-built elements of the infrastructure that take an active part in its management by keeping promises. Within Promise Theory, a promise is an agent's declaration of intent to follow an intended instruction defining operational behavior. This allows management teams to create an abstract end-state model and the system to automate the configuration in compliance. With declarative end-state modeling, it is easier to build and manage networks of any scale with less effort.
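The declarative exchange described above can be sketched in a few lines of Python. This is an illustrative model only, not ACI code: the names `Agent`, `declare` and `supported_vlans` are hypothetical, and the VLAN capability check merely stands in for any reason an agent might be unable to keep a promise.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical autonomous agent (for example a switch) that keeps or declines promises."""
    name: str
    supported_vlans: range          # assumption: a capability used to illustrate a refusal
    state: dict = field(default_factory=dict)

    def converge(self, desired: dict) -> tuple[bool, str]:
        """Try to reach the declared end state; report why if it cannot be reached."""
        vlan = desired["vlan"]
        if vlan not in self.supported_vlans:
            return False, f"{self.name}: VLAN {vlan} not supported"
        self.state = dict(desired)  # the agent itself decides how to apply the change
        return True, f"{self.name}: converged"


def declare(agents: list[Agent], desired: dict) -> list[tuple[bool, str]]:
    """The central system states what is wanted and collects each agent's answer."""
    return [agent.converge(desired) for agent in agents]


leaf1 = Agent("leaf1", range(1, 4095))
leaf2 = Agent("leaf2", range(1, 100))
results = declare([leaf1, leaf2], {"vlan": 200})
for kept, message in results:
    print("kept" if kept else "not kept", "-", message)
```

The controller never issues step-by-step commands; it only states the end state, and each agent either converges or explains why it could not.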

    Many new ideas, concepts and terms come with this coupling of ACI and Promise Theory. This book is not intended to be a complete tutorial on ACI or Promise Theory, nor is it intended to be a complete operations manual for ACI, or a complete dictionary of terms and concepts. Where possible, however, a base level of definitions will be provided, accompanied by explanations. The goal is to provide common concepts, terms, models and fundamental features of the fabric, then use that base knowledge to dive into troubleshooting methodology and exercises.

    To read more information on Cisco's Application Centric Infrastructure, the reader may refer to the Cisco website at https://www.cisco.com/go/aci.

    3 ACI Policy Model

        Abstraction Model
        Everything is an Object
        Relevant Objects and Relationships
        Hierarchical ACI Object Model and the Infrastructure
        Infrastructure as Objects
        Build object, use object to build policy, reuse policy
        REST API just exposes the object model
        Logical model, Resolved model, concrete model
        Formed and Unformed Relationships
        Declarative End State and Promise Theory

    While the comprehensive policy model that ACI utilizes is broad, the goal of this chapter is to give the reader a basic understanding of the model, what it contains and how to work with it. The complete object model contains a vast amount of information that represents a complete hierarchy of data center interactions, so it is recommended that the reader take the time to review the many white papers available on cisco.com or, for the most extensive information resource available, review the APIC Management Information Model Reference packaged with the APIC itself.

    3.1 Abstraction Model

    ACI provides the ability to create a stateless definition of application requirements. Application architects think in terms of application components and interactions between such components; not necessarily thinking about networks, firewalls and other services. By abstracting away the infrastructure, application architects can build stateless policies and define not only the application, but also Layer 4 through 7 services and interactions within applications. Abstraction also means that the policy defining application requirements is no longer tied to traditional network constructs, and thus removes dependencies on the infrastructure and increases the portability of applications.

    The application policy model defines application requirements, and based on the specified requirements, each device will instantiate a set of required changes. IP addresses become fully portable within the fabric, while security and forwarding are decoupled from any physical or virtual network attributes. Devices autonomously and consistently update the state of the network based on the configured policy requirements set within the application profile definitions.

    3.2 Everything is an Object

    The abstracted model utilized in ACI is object-oriented, and everything in the model is represented as an object, each with properties relevant to that object. As is typical for an object-oriented system, these objects can be grouped, classed, read, and manipulated, and objects can be created referencing other objects. These objects can reference relevant application components as well as relationships between these components. The rest of this section will describe the elements of the model, the objects inside, and their relationships at a high level.

    3.3 Relevant Objects and Relationships

    Within the ACI application model, the primary object that encompasses all of the objects and their relationships to each other is called an Application Profile, or AP. Some readers are certain to think, "a 3-tier app is a unicorn," but in this case, the idea of a literal 3-tier application works well for illustrative purposes. Below is a diagram of an AP shown as a logical structure for a 3-tier application that will serve well for describing the relevant objects and relationships.

    From left to right, in this 3-tier application there is a group of clients that can be categorized and grouped together. Next there is a group of web servers, followed by a group of application servers, and finally a group of database servers. There exist relationships between each of these independent groups. For example, from the clients to the application servers, there are relationships that can be described in the policy which can include things such as QoS, ACLs, Firewall and Server Load Balancing service insertion. Each of these things is defined by managed objects, and the relationships between them are used to build out the logical model, which is then resolved into the hardware automatically.

    Endpoints are objects that represent individual workload engines (for example, virtual or physical machines). The following diagram emphasizes which elements in the policy model are endpoints, which include web, application and database virtual machines.

    These endpoints are logically grouped together into another object called an Endpoint Group, or EPG. The following diagram highlights the EPG boundaries in the diagram, and there are four EPGs - Clients, Web servers, Application servers, and Database servers.

    There are also Service Nodes, which are referenceable objects, either physical or virtual, such as Firewalls and Server Load Balancers (or Application Delivery Controllers/ADC). In this example, a firewall and load balancer combination is chained between the client and web EPGs, a load balancer sits between the web and application EPGs, and finally a firewall secures traffic between the application and database EPGs.

    A group of Service Node objects can be logically chained into a sequence of services represented by another object called a Service Graph. A Service Graph object provides compound service chains along the data path. The diagram below shows where the Service Graph objects are inserted into a policy definition, emphasizing the grouped service nodes in the previous diagram.

    With objects defined to express the essential elements of the application, it is possible to build relationships between the EPG objects, using another object called a Contract. A Contract defines what provides a service, what consumes a service and what policy objects are related to that consumption relationship. In the case of the relationship between the clients and the web servers, the policy defines the communication path and all of its related elements. As shown in the details of the example below, the Web EPG provides a service that the Clients EPG consumes, and that consumption would be subject to a Filter (ACL) and a Service Graph that includes Firewall inspection services and Server Load Balancing.
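The objects introduced so far (endpoints, EPGs, Service Graphs, Contracts and the Application Profile) can be pictured as plain data structures. The sketch below is a simplified illustration under stated assumptions: the class and field names are hypothetical and are not the class names used by the APIC object model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EPG:
    """Endpoint Group: a named collection of endpoints."""
    name: str
    endpoints: tuple[str, ...] = ()


@dataclass(frozen=True)
class Contract:
    """Relates a provider EPG to a consumer EPG, with the policy applied between them."""
    name: str
    provider: EPG
    consumer: EPG
    filters: tuple[str, ...] = ()        # ACL-style filter names (illustrative)
    service_graph: tuple[str, ...] = ()  # ordered service nodes along the data path


@dataclass
class ApplicationProfile:
    """Top-level container for the EPGs and the contracts between them."""
    name: str
    epgs: list[EPG] = field(default_factory=list)
    contracts: list[Contract] = field(default_factory=list)


clients = EPG("Clients")
web = EPG("Web", endpoints=("web-vm-1", "web-vm-2"))
client_to_web = Contract(
    name="clients-to-web",
    provider=web,
    consumer=clients,
    filters=("http",),
    service_graph=("firewall", "load-balancer"),
)
ap = ApplicationProfile("3-tier-app", epgs=[clients, web], contracts=[client_to_web])

print(f"{client_to_web.consumer.name} consumes {client_to_web.provider.name} "
      f"via {' -> '.join(client_to_web.service_graph)}")
```

The point of the sketch is that the contract, not the EPG, carries the filter and the service chain, so the same EPG objects can be reused under different contracts.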

    A concept to note is that ACI fabrics are built on the premise of a whitelist security approach, which allows the ACI fabric to function as a semi-stateful firewall fabric. This means communication is implicitly denied, and that one must build a policy to allow communication between objects or they will be unable to communicate. In the example above, with the contract in place as highlighted, the Clients EPG can communicate with the Web EPG, but the Clients cannot communicate with the App or DB EPGs. This is not explicit in the contract, but native to the fabric's function.
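The whitelist behavior can be expressed as a default-deny lookup. The following is a deliberately simplified sketch: it treats a contract as an unordered pair of EPG names and ignores filters, direction and service graphs, all of which a real fabric would take into account. The function and variable names are hypothetical.

```python
def is_allowed(epg_a: str, epg_b: str, contracts: set[frozenset[str]]) -> bool:
    """Default deny: two EPGs may communicate only if a contract relates them."""
    return frozenset((epg_a, epg_b)) in contracts


# Only one contract exists in the example: Clients <-> Web.
contracts = {frozenset(("Clients", "Web"))}

for peer in ("Web", "App", "DB"):
    verdict = "permitted" if is_allowed("Clients", peer, contracts) else "denied"
    print(f"Clients -> {peer}: {verdict}")
```

Nothing in the contract set mentions App or DB; they are denied simply because no entry permits them.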

    3.4 Hierarchical ACI Object Model and the Infrastructure

    The APIC manages a distributed Management Information Tree (dMIT), the hierarchical tree model around which the Cisco ACI object model is organized. The dMIT is the single source of truth in the object model, and is used in the discovery, management and maintenance of the whole hierarchical tree of objects in the ACI fabric, including their configuration, operational status, and accompanying statistics and associated faults.

    As mentioned before, within the dMIT, the Application Profile is the modeled representation of an application, network characteristics and connections, services, and all the relationships between all of these lower-level objects. These objects are instantiated as Managed Objects (MO) and are stored in the dMIT in a hierarchical tree, as shown below:

    All of the configurable elements shown in this diagram are represented as classes, and the classes define the items that get instantiated as MOs, which are used to fully describe the entity including its configuration, state, runtime data, description, referenced objects and lifecycle position.

    Each node in the dMIT represents a managed object or group of objects. These objects are organized in a hierarchical structure, similar to a structured file system with logical object containers like folders. Every object has a parent, with the exception of the top object, called root, which is the top of the tree. Relationships exist between objects in the tree.

    Objects include a class, which describes the type of object such as a port, module or network path, VLAN, Bridge Domain, or EPG. Packages identify the functional areas to which the objects belong. Classes are organized hierarchically so that, for example, an access port is a subclass of the class Port, or a leaf node is a subclass of the class Fabric Node.

    Managed Objects can be referenced through relative names (Rn) that consist of a prefix matched up with a name property of the object. As an example, the prefix for a Tenant is tn, so if the name is Cisco, the result is an Rn of tn-Cisco for the MO.

    Managed Objects can also be referenced via Distinguished Names (Dn), which are the combination of the scope of the MO and the Rn of the MO, as mentioned above. As an example, if there is a tenant named Cisco that is a policy object in the top level of the Policy Universe (polUni), that would combine to give a Dn of uni/tn-Cisco. In general, the Dn can be compared to a fully qualified domain name.
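    As a small illustration of the naming scheme, the following Python sketch builds the Rn and Dn from the example above. The helper names make_rn and make_dn are hypothetical and exist only for this illustration.

```python
def make_rn(prefix, name):
    """Build a relative name (Rn) from a class prefix and the object's name."""
    return f"{prefix}-{name}"


def make_dn(parent_dn, rn):
    """Build a distinguished name (Dn) from the parent's Dn and the object's Rn."""
    return f"{parent_dn}/{rn}"


tenant_rn = make_rn("tn", "Cisco")     # prefix for a Tenant is "tn"
tenant_dn = make_dn("uni", tenant_rn)  # the tenant sits under the Policy Universe

print(tenant_rn)  # tn-Cisco
print(tenant_dn)  # uni/tn-Cisco
```

    The Dn is simply the chain of Rns from the top of the tree down to the object, which is why it behaves much like a fully qualified name.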

    Because of the hierarchical nature of the tree, and the attribute system used to identify object classes, the tree can be queried in several ways for MO information. Queries can be performed on an object itself through its Dn, on a class of objects such as switch chassis, or on a tree-level, discovering all members of an object.

    The structure of the dMIT provides easy classification of all aspects of the relevant configuration, as the application objects are organized into related classes, as well as hardware objects and fabric objects into related classes that allow for easy reference, reading and manipulation from individual object properties or multiple objects at a time by reference to a class. This allows configuration and management of multiple similar components as efficiently as possible with a minimum of iterative static configuration.

    3.5 Infrastructure as Objects

    ACI uses a combination of Cisco Nexus 9000 Series Switch hardware and Application Policy Infrastructure Controllers for policy-based fabric configuration and management. These infrastructure components can be integrated with Cisco and third-party service products to automatically provision end-to-end network solutions.

    As shown in the following diagram, the logical policy model is built through manipulation of the dMIT, either through direct GUI, programmatic API, or through traditional CLI methods. Once the policy is built, the intention of the policy gets resolved into an abstract model, then is conferred to the infrastructure elements. The infrastructure elements contain specific Cisco ASIC hardware that make them equipped, purpose-built agents of change that can understand the abstraction that the policy controller presents to them, and automate the relevant concrete configuration based on the abstract model. This configuration gets applied when an endpoint connects to the fabric and first transmits traffic.

    The purpose-built hardware providing the intelligent resolution of policy configuration is built on a spine-leaf architecture providing consistent network forwarding and deterministic latency. The hardware is also able to normalize the encapsulation coming in from multiple different endpoints regardless of the type of connectivity.

    Whether an endpoint connects to the fabric with an overlay encapsulation (such as VXLAN), physical port connectivity, or VLAN 802.1Q tagging, the fabric can accept that traffic, de-encapsulate it, re-encapsulate it to VXLAN for fabric forwarding, then de-encapsulate and re-encapsulate it to whatever the destination expects to see. This gateway function of encapsulation normalization happens at optimized hardware speeds in the fabric and creates no additional latency or software gateway penalty to perform the operation outside of the fabric.

    In this manner, if there is a VM running on VMware ESX utilizing VXLAN, a VM running on Hyper-V using VLAN 802.1Q encapsulation, and a physical server running a bare metal database workload on top of Linux, it is possible to configure policy to allow each of these to communicate directly with each other without having to bounce to any separate gateway function.

    This automated provisioning of end-to-end application policy provides consistent implementation of relevant connectivity, quality measures, and security requirements. This model is extensible, and has the potential capability to be extended into compute and storage for complete application policy-based provisioning.

    The automation of the configuration takes the Logical model, and translates it into other models, such as the Resolved model and the Concrete model (covered later in this chapter). The automation process resolves configuration information into the object and class-based configuration elements that then get applied based on the object and class. As an example, if the system is applying a configuration to a port or a group of ports, the system would likely utilize a class-based identifier to apply configuration broadly without manual iteration. As an example, a class is used to identify objects like cards, ports, paths, etc.; port Ethernet 1/1 is a member of class port, and a type of port configuration, such as an access or trunk port, is a subclass of a port. A leaf node or a spine node is a subclass of a fabric node, and so forth.

    The types of objects and relationships of the different networking elements within the policy model can be seen in the diagram below. Each of these elements can be managed via the object model being manipulated through the APIC, and each element could be directly manipulated via REST API.

    3.6 Build object, use object to build policy, reuse policy

    The inherent model of ACI is built on the premise of object structure, reference and reuse. In order to build an AP, one must first create the building blocks with all the relevant information for those objects. Once those are created, it is possible to build other objects referencing the originally created objects as well as reuse other objects. As an example, it is possible to build EPG objects, use those to build an AP object, and reuse the AP object to deploy to different tenant implementations, such as a Development Environment AP, a Test Environment AP, and a Production Environment AP. In the same fashion, an EPG used to construct a Test AP may later be placed into a Production AP, accelerating the time to migrate from a test environment into production.

    3.7 REST API just exposes the object model

    REST stands for Representational State Transfer, and is a reference model for direct object manipulation via HTTP protocol based operations.

    The uniform ACI object model places clean boundaries between the different components that can be read or manipulated in the system. When an object exists in the tree, whether it is an object that was derived from discovery (such as a port or module) or from configuration (such as an EPG or policy graph), the object is exposed via the REST API through a Uniform Resource Identifier (URI). The structure of the REST API calls is shown below with a couple of examples.

    The general structure of the REST API commands is seen at the top. Below the general structure are two specific examples of what can be done with this structured URI.
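    Since the diagram is not reproduced here, the following Python sketch shows how such URIs might be assembled for a query on a single object by Dn, on a class, and on the children of an object. It assumes the commonly documented /api/mo/ and /api/class/ path forms and the query-target option; the address apic.example.com and the helper names are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

APIC = "https://apic.example.com"  # hypothetical APIC address


def _build(path, options):
    """Join the base address, the path and any query options into one URI."""
    # Option names such as query-target contain hyphens, so callers pass
    # query_target and the underscore is converted here.
    query = urlencode({key.replace("_", "-"): value for key, value in options.items()})
    return f"{APIC}{path}" + (f"?{query}" if query else "")


def mo_url(dn, fmt="json", **options):
    """URI for a query on one managed object, addressed by its Dn."""
    return _build(f"/api/mo/{dn}.{fmt}", options)


def class_url(class_name, fmt="json", **options):
    """URI for a query on every object of a given class."""
    return _build(f"/api/class/{class_name}.{fmt}", options)


print(mo_url("uni/tn-Cisco"))
print(class_url("fvTenant"))
print(mo_url("uni/tn-Cisco", fmt="xml", query_target="children"))
```

    The same Dn that identifies an object in the tree appears directly in the path, which is the sense in which the REST API just exposes the object model.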

    3.8 Logical model, Resolved model, concrete model

    Within the ACI object model, there are essentially three stages of implementation of the model: the Logical Model, the Resolved Model, and the Concrete Model.

    The Logical Model is the logical representation of the objects and their relationships. The AP that was discussed previously is an expression of the logical model. This is the declaration of the end-state expression that is desired when the elements of the application are connected and the fabric is provisioned by the APIC, stated in high-level terms.

    The Resolved Model is the abstract model expression that the APIC resolves from the logical model. This is essentially the elemental configuration components that would be delivered to the physical infrastructure when the policy must be executed (such as when an endpoint connects to a leaf).

    The Concrete Model is the actual in-state configuration delivered to each individual fabric member based on the resolved model and the Endpoints attached to the fabric.

    In general, the logical model should be the high-level expression of what exists in the resolved model, which should be present on the concrete devices as the concrete model expression. If there is any gap in these, there will be inconsistent configurations.

    3.9 Formed and Unformed Relationships

    In creating objects and forming their relationships within the ACI fabric, a relationship is expressed when an object is a provider of a service, and another object is a consumer of that provided service. If a relationship is formed and one side of the service is not connected, the relationship would be considered to be unformed. If a consumer exists with no provider, or a provider exists with no consumer, this would be an unformed relationship. If both a consumer and provider exist and are connected for a specific service, that relationship is fully formed.
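    The rule above reduces to a simple check, sketched here in Python. The function name and the use of plain lists of EPG names are assumptions made for illustration, not an APIC interface.

```python
def relationship_state(providers, consumers):
    """Classify a service relationship as 'formed' or 'unformed'.

    A relationship is formed only when at least one provider and at least
    one consumer are attached to the same service.
    """
    return "formed" if providers and consumers else "unformed"


print(relationship_state(["Web"], ["Clients"]))  # formed
print(relationship_state(["Web"], []))           # unformed: no consumer
print(relationship_state([], ["Clients"]))       # unformed: no provider
```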

    3.10 Declarative End State and Promise Theory

    For many years, infrastructure management has been built on a static and inflexible configuration paradigm. In terms of theory, traditional configuration via traditional methods (CLI configuration of each box individually), where configuration must be done on every device for every possibility of every thing prior to this thing connecting, is termed an Imperative Model of configuration. In this model, due to the way configuration is built for eventual possibility, the trend is to overbuild the infrastructure configuration to a fairly significant amount. When this is done, fragility and complexity increase with every eventuality included.

    Similar to what is illustrated above, if configuration must be made on a single port for an ESXi host, it must be configured to trunk all information for all of the possible VLANs that might get used by a vSwitch or DVS on the host, whether or not a VM actually exists on that host. On top of that, additional ACLs may need to be configured for all possible entries on that port, VLAN or switch to allow/restrict traffic to/from the VMs that might end up migrating to that host/segment/switch. That is a fairly heavyweight set of tasks for just some portions of the infrastructure, and that continues to build as peripheral aspects of this same problem are evaluated. As these configurations are built, hardware resource tables are filled up even if they are not needed for actual forwarding. Also reflected are configurations on the service nodes for eventualities that can build and grow, many times being added but rarely ever removed. This eventually can grow into a fairly fragile state that might be considered a form of firewall house of cards. As these building blocks are built up over time and a broader perspective is taken, it becomes difficult to understand which ones can be removed without the whole stack tumbling down. This is one of the possible things that can happen when things are built on an imperative model.

    On the other hand, a declarative model allows a system to describe the end-state expectations system-wide as depicted above, and allows the system to utilize its knowledge of the integrated hardware and automation tools to execute the required work to deliver the end state. Imagine an infrastructure system where statements of desire can be made, such as "these things should connect to those things and let them talk in this way", and the infrastructure converges on that desired end state. When that configuration is no longer needed, the system knows this and removes that configuration.
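    The idea of converging on a declared end state can be sketched in a few lines of Python. This is only an illustration of the declarative principle, not of the APIC's internal logic; the reconcile function and the VLAN numbers are made up for the example.

```python
def reconcile(desired, actual):
    """Return (to_add, to_remove): the changes that move actual toward desired."""
    to_add = desired - actual     # declared but not yet configured
    to_remove = actual - desired  # configured but no longer declared
    return to_add, to_remove


desired_vlans = {10, 20}     # what the policy declares should exist
actual_vlans = {10, 30, 40}  # what is currently configured on the port

to_add, to_remove = reconcile(desired_vlans, actual_vlans)
print(sorted(to_add))     # [20]
print(sorted(to_remove))  # [30, 40]
```

    In the imperative approach described earlier, VLANs 30 and 40 would typically stay configured "just in case"; in the declarative approach they are removed because nothing declares a need for them.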

    Promise Theory is built on the principles that allow for systems to be designed based on the declarative model. It is built on voluntary execution by autonomous agents which provide and consume services from one another based on promises.

    As the IT industry continues to build and scale more and more, information and systems are rapidly reaching breaking points where scaled-out infrastructure cannot stretch to the hardware resources without violating the economic equilibrium, nor scale-in the management without integrated agent-based automation. This is why Cisco ACI, as a system built on promise theory, is a purpose-built system for addressing scale problems that are delivery challenges with traditional models.

    4 Troubleshooting Tools

    • APIC Access Methods
        • GUI
        • API
        • CLI
            • CLI MODES
            • Navigating the CLI
    • Programmatic Configuration (Python)
    • Fabric Node Access Methods
    • Exporting information from the Fabric
    • External Data Collection Syslog, SNMP, Call-Home
    • Health Scores
    • Atomic Counters

    This section is intended to provide an overview of the tools that could be used during troubleshooting efforts on an ACI fabric. This is not intended to be a complete reference list of all possible tools, but rather a high level list of the most common tools used.

    4.1 APIC Access Methods

    There are multiple ways to connect to and manage the ACI fabric and object model. An administrator can use the built-in Graphical User Interface (GUI), programmatic methods using an Application Programming Interface (API), or standard Command Line Interface (CLI). While there are multiple ways to access the APIC, the APIC is still the single point of truth. All of these access methods - the GUI, CLI and REST API - are just interfaces resolving to the API, which is the abstraction of the object model managed by the DME.

    GUI

    One of the primary ways to configure, verify and monitor the ACI fabric is through the APIC GUI. The APIC GUI is a browser-based HTML5 application that provides a representation of the object model and is the interface most people are likely to start with. The GUI is accessible through a browser at the URL https://

    The GUI does not expose the underlying policy object model. One of the available tools for browsing the MIT, in addition to leveraging the CLI, is called Visore and is available on the APIC and nodes. Visore supports querying by class and object, as well as easily navigating the hierarchy of the tree. Visore is accessible through a browser at the URL https:///visore.html

    API

    The APIC supports REST API connections via HTTP/HTTPS for processing of XML/JSON documents for rapid configuration. The API can also be used to verify the configured policy on the system. This is covered in detail in the REST API chapter.

    A common tool used to query the system is Postman, a web browser based app that runs on the Google Chrome (tm) web browser.
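    As an illustration of the kind of JSON document the API processes, the following Python sketch prepares an authentication request. It assumes the commonly documented aaaLogin endpoint and aaaUser payload; the address apic.example.com and the credentials are placeholders, and the sketch only builds the request without sending it.

```python
import json

APIC = "https://apic.example.com"  # hypothetical APIC address


def build_login_request(username, password):
    """Return the (url, body) pair for an APIC login POST, without sending it."""
    url = f"{APIC}/api/aaaLogin.json"
    payload = {"aaaUser": {"attributes": {"name": username, "pwd": password}}}
    return url, json.dumps(payload)


url, body = build_login_request("admin", "example-password")
print(url)
print(body)
```

    The body would be sent as an HTTPS POST by a client such as Postman or a Python HTTP library; subsequent queries then reuse the session returned by the login.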

    CLI

    The CLI can be used in configuring the APIC. It can be used extensively in troubleshooting the system as it allows real-time visibility of the configuration, faults, and statistics of the system, or alternatively as an object model browser. Typically the CLI is accessed via SSH with the appropriate administrative level credentials. The APIC CLI can be accessed as well through the CIMC KVM (Cisco Integrated Management Controller Keyboard Video Mouse interface).

    CLI access is also available for troubleshooting the fabric nodes either through SSH or the console.

    The APIC and fabric nodes are based on a Linux kernel but there are some ACI specific commands and modes of access that will be used in this book.

    CLI MODES:

    APIC: The APIC has fundamentally only one CLI access mode. The commands used in this book assume administrative level access to the APIC by use of the admin user account.

    Fabric Node: The switch running ACI software has several different modes that can be used to access different levels of information on the system:

    CLI - The CLI will be used to run NX-OS and Bash shell commands to check the concrete models on the switch. For example show vlan, show endpoint, etc. In some documentation this may have been referred to as Bash, iBash, or iShell.

    vsh_lc - This is the line card shell and it will be used to check line card processes and forwarding tables specific to the Application Leaf Engine (ALE) ASIC.

    Broadcom Shell - This shell is used to view information on the Broadcom ASIC. The shell will not be covered as it falls outside the scope of this book; it is assumed that troubleshooting at the Broadcom Shell level should be performed with the assistance of the Cisco Technical Assistance Center (TAC).

    Virtual Shell (VSH) - Provides deprecated NX-OS CLI shell access to the switch. This mode can provide output on a switch in ACI mode that could be inaccurate. This mode is not recommended and not supported; commands that provide useful output should be available from the normal CLI access mode.

    Navigating the CLI:

    On a fabric node, the CLI has common commands as well as some unique differences from what might be seen in NX-OS. On the APIC, the command structure has common commands as well as some unique differences compared to Linux Bash. This section highlights a few of these commands but is not meant to replace existing external documentation on Linux, Bash, ACI, and NX-OS.

    Common Bash commands: When using the CLI, some basic understanding of Linux and Bash is necessary. These commands include:

    man - prints the online manual pages. For example, man cd will display what the command cd does

    ls - list directory contents

    cd - change directory

    cat - print the contents of a file

    less - simple navigation tool for displaying the contents of a file

    grep - print out a matching line from a file

    ps - show current running processes, typically used with the options ps -ef

    netstat - display network connection status. netstat -a will display active connections and the ports on which the system is listening

    ip route show - displays the kernel route table. This is useful on the APIC but not on the fabric node

    pwd - print the current directory

    Common CLI commands: Beyond the normal NX-OS commands on a fabric node, there are several more commands that are specific to ACI. Some CLI commands referenced in this guide are listed below:

    acidiag - Specifically acidiag avread and acidiag fnvread are two common commands to check the status of the controllers and the fabric nodes

    techsupport - CLI command to collect the techsupport files from the device

    attach - From the APIC, opens up an SSH session to the named node. For example attach rtp_leaf1

    iping/itraceroute - Fabric node commands used in place of ping/traceroute which provide similar functionality against a fabric device address and VRF. Note that the Bash ping and traceroute commands do work but are effective only for the switch OOB access.

    Help: When navigating around the APIC CLI, there are some differences when compared to NX-OS.

    - similar to NX-OS ? to get a list of command options and keywords.

    - autocomplete of the command. For example, show int will complete to show interface.

    man - displays the manual and usage output for the command.

    4.2 Programmatic Configuration (Python)

    A popular modern programming language is Python, which provides simple object-oriented semantics in interpreted, easy-to-write code. The APIC can be configured through the use of Python through an available APIC Software Development Kit (SDK) or via the REST API.

    4.3 Fabric Node Access Methods

    CLI

    In general, most work within ACI will be done through the APIC using the access methods listed above. There are, however, times in which one must directly access the individual fabric nodes (switches). Fabric nodes can be accessed via SSH using the fabric administrative level credentials. The CLI is not used for configuration but is used extensively for troubleshooting purposes. The fabric nodes have a Linux shell along with a CLI interpreter to run show level commands. The CLI can be accessed through the console port as well.

    Faults

    The APIC automatically detects issues on the system and records them as faults. Faults are displayed in the GUI until the underlying issue is cleared. After faults are cleared, they are retained until they are acknowledged or until the retaining timer has expired. The fault is composed of system parameters, which are used to indicate the reason for the failure, and where the fault is located. In some cases, fault messages link to help text that explains possible actions.
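    The retention behavior described above can be expressed as a small Python sketch. It models only the rules stated in this section and is not the APIC's fault lifecycle implementation; the function name, parameters and the timer value in the usage lines are assumptions.

```python
def fault_is_present(cleared, acknowledged, seconds_since_clear, retention_seconds):
    """Decide whether a fault record is still kept, per the rules in the text."""
    if not cleared:
        return True   # underlying issue still exists, fault stays displayed
    if acknowledged:
        return False  # cleared and acknowledged, fault is removed
    # Cleared but not acknowledged: kept until the retaining timer expires.
    return seconds_since_clear < retention_seconds


# Hypothetical retaining timer of one hour.
print(fault_is_present(False, False, 0, 3600))    # True: issue not cleared
print(fault_is_present(True, False, 120, 3600))   # True: retained, timer running
print(fault_is_present(True, True, 120, 3600))    # False: acknowledged
print(fault_is_present(True, False, 7200, 3600))  # False: timer expired
```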

    4.4 Exporting information from the Fabric

    Techsupport

    The Techsupport files in ACI capture application logs, system and services logs, version information, faults, event and audit logs, debug counters and other command output, then bundle all of that into a single compressed file (tarball) on the system that can be exported to an external location for off-system processing. Techsupport is similar to functionality available on other Cisco products that allows for a simple collection of copious amounts of relevant data from the system. This collection can be initiated through the GUI or through the CLI using the command techsupport.

    Core Files

    A process crash on the ACI fabric will generate a core file, which can be used to determine the reason why the process crashed. This information can be exported from the APIC for decoding by Cisco support and engineering teams.

    4.5 External Data Collection Syslog, SNMP, Call-Home

    There are a variety of external collectors that can be configured to collect a variety of system data. The call-home feature can be configured to relay information via emails through an SMTP server, for a network engineer or to Cisco Smart Call Home to generate a case with the TAC.

    4.6 Health Scores

    The APIC manages and automates the underlying forwarding components and Layer 4 to Layer 7 service devices. Using visibility into both the virtual and physical infrastructure, as well as the knowledge of the application end-to-end based on the application profile, the APIC can calculate an application health score. This health score represents the network health of the application across virtual and physical resources, including Layer 4 to Layer 7 devices. The score includes failures, packet drops, and other indicators of system health.

    The health score provides enhanced visibility on both application and tenant levels. The health score can drive further value by being used to trigger automated events at specific thresholds. This ability allows the network to respond automatically to application health by making changes before users are impacted.

    4.7 Atomic Counters

    Atomic counters can be configured to monitor endpoint/EPG to endpoint/EPG traffic within a tenant for identifying and isolating traffic loss. Once configured, the packet counters on a configured policy are updated every 30 seconds. Atomic counters are valid when endpoints reside on different leaf nodes.
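    To show how such counters help isolate traffic loss, the following Python sketch compares the packets transmitted by a source with the packets received at the destination over one update interval. The function name and the sample values are hypothetical; the actual counter objects and their fields on the APIC are not shown here.

```python
def packet_loss(tx_packets, rx_packets):
    """Return (dropped, percent) for one counter interval."""
    dropped = max(tx_packets - rx_packets, 0)
    percent = 100.0 * dropped / tx_packets if tx_packets else 0.0
    return dropped, percent


# Sample values for one 30-second interval between two EPGs.
dropped, percent = packet_loss(tx_packets=1000, rx_packets=990)
print(dropped, percent)  # 10 1.0
```

    A non-zero result between two endpoints on different leaf nodes points to loss somewhere along the path between them, which narrows down where to look next.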

    5 Troubleshooting Methodology

    Overall Methodology

    5.1 Overall Methodology

    Troubleshooting is the systematic process used to identify the cause of a problem. The problem to be addressed is determined by the difference between how some entity (function, process, feature, etc.) should be working versus how it is working. Once the cause is identified, the appropriate actions can be taken to either correct the issue or mitigate the effects: the latter is sometimes referred to as a workaround.

    Initial efforts in the process focus around understanding more completely the issue that is occurring. Effective troubleshooting should be based on an evidence-driven method, rather than a symptomatic level exploration. This can be done by asking the question:

    What evidence do we have ...?

    The intent of this question is to move towards an observed factual evidence-driven method where the evidence is generally taken from the system where the problem is observed.

    Troubleshooting is an iterative process attempting to isolate an issue to the point that some action can be taken to have a positive effect. Often this is a multi-step process which moves toward isolating the issue. For example, in deploying an application on a server attached to an ACI fabric, a possible problem observed could be that the application does not seem to respond from a client on the network. The isolation steps may look something like this:

    Troubleshooting is usually not a simple linear path, and in this example it is possible that a troubleshooter may have observed the system fault earlier in the process and started at that stage.

    In this example, information related to the problem came from several data points in the system. These data points can be part of a linear causal process or can be used to better understand the scope and various points and conditions that better define the issue. How these data points are collected is defined by three characteristics:

    WHAT: What information is being collected

    WHERE: Where on the system is the information being collected

    HOW: The method used in collecting the information

    For example, the state of the fabric Ethernet interface can be gathered from the leaf through the CLI on the leaf in a couple of different ways. This information can also be gathered from the APIC, either through the GUI or through a REST API call. When troubleshooting, it is important to understand where else relevant information is likely to come from in order to build a better picture of the issue.

    6 Sample Reference Topology

    • Physical Fabric Topology
    • Logical Application Topology

    6.1 Physical Fabric Topology

    For a consistent frame of reference, a sample reference topology has been deployed, provisioned and used throughout the book. This ensures a consistent reference for the different scenarios and troubleshooting exercises.

    This section explores the different aspects of the reference topology, from the logical application view to the physical fabric and any supporting details that will be used throughout the troubleshooting exercises. Each individual section will call out the specific components that have been focused on so the reader does not have to refer back to this section in every exercise.

    The topology includes a Cisco ACI Fabric composed of three clustered Cisco APIC Controllers, two Nexus 9500 spine switches, and three Nexus 9300 leaf switches. The APICs and Nexus 9000 switches are running the release that was current on www.cisco.com at the time of the initial version of this book. This is APIC version 1.0(1k) and Nexus ACI-mode version 11.0(1d).

    The fabric is connected to both external Layer 2 and Layer 3 networks. For the external Layer 2 network, the connection use

