Ether Rake

Date post: 02-Jun-2018
Upload: rex-mendoza-orense
  • 8/11/2019 Ether Rake

    1/36

    EtherRake: Diagnosis and Monitoring in Data Center & Enterprise Networks

    Lab for Internet and Security Technology (LIST)

    Northwestern Univ.

    General Idea of EtherRake

    Problem statement:

    Emerging data center (DC) and enterprise networks consist mainly of a large number of switches, all of which need monitoring and diagnosis.

    Collector at each switch

    Take Cisco switches for example

    Port information

    show port status (on Huawei: display interface ethernet0/1)

    Neighbor Information

    show CDP neighbors

    Forwarding tables (aka switch table)

    show MAC-to-interface mapping

    Collector at each switch

    Port information

    Port Number: 2 Bytes

    Status: 4 bits

    Total: 3 Bytes * 100 = 300 Bytes < 0.4 KB per switch

    Neighbor Information

    MAC Address: 48 bits

    Total: 6 Bytes * 100 = 600 Bytes < 0.6 KB per switch

    Forwarding Tables

    To be decided. We are not using them in our approach for now.

    We can transfer updates only, which means that normally we don't need to transfer anything.

    Total: 1 KB * 1024 (number of switches) = 1 MB in one round.
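
    The size estimate can be checked with a few lines of arithmetic. This is a minimal sketch: the factor of 100 is taken from the slide and is assumed here to be the number of entries reported per switch, and all variable names are illustrative.

```python
import math

ENTRIES_PER_SWITCH = 100   # assumption: the "* 100" factor on the slide
NUM_SWITCHES = 1024        # from the slide

# Port information: 2-byte port number plus 4 status bits, rounded up to whole bytes.
port_entry_bytes = math.ceil((2 * 8 + 4) / 8)
port_info_bytes = port_entry_bytes * ENTRIES_PER_SWITCH

# Neighbor information: one 48-bit MAC address per entry.
neighbor_entry_bytes = 48 // 8
neighbor_info_bytes = neighbor_entry_bytes * ENTRIES_PER_SWITCH

per_switch_bytes = port_info_bytes + neighbor_info_bytes

# The slide rounds each part up (0.4 KB + 0.6 KB), giving a 1 KB budget per switch.
budget_per_round_bytes = 1024 * NUM_SWITCHES

print("port info per switch:", port_info_bytes, "bytes")
print("neighbor info per switch:", neighbor_info_bytes, "bytes")
print("budget per round:", budget_per_round_bytes / 2**20, "MB")
```

    The exact per-switch total is below the 1 KB budget, so 1 MB per round is an upper bound rather than an exact figure.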

    Collector at each switch

    Synchronization

    Cristian's algorithm (P is processing center, and

    S is a collector)

    P requests the time from S

    After receiving the request from P, S prepares a

    response and appends the time T from its own clock.

    P then sets its time to be T + RTT/2

    Multiple measurements can reduce the error.

    Accuracy: the true time lies between (T + min) and (T + RTT - min), where min is the minimum one-way delay.
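
    The exchange can be sketched as follows. This is an illustration under stated assumptions, not EtherRake's implementation: `request_time` is a hypothetical callable standing in for whatever transport asks S for its clock value, and the function names are made up for this example.

```python
import time

def cristian_estimate(request_time, now=time.monotonic):
    """Run one request/response exchange and return (estimated_time, rtt).

    request_time: hypothetical callable that returns S's clock value T.
    now: local clock used only to measure the round-trip time.
    """
    t0 = now()
    remote_t = request_time()
    t1 = now()
    rtt = t1 - t0
    return remote_t + rtt / 2, rtt

def best_of(request_time, rounds=5, now=time.monotonic):
    """Repeat the exchange and keep the sample with the smallest RTT."""
    samples = [cristian_estimate(request_time, now) for _ in range(rounds)]
    return min(samples, key=lambda sample: sample[1])

def error_bound(rtt, min_one_way):
    """Half-width of the interval (T + min, T + RTT - min) around T + RTT/2."""
    return rtt / 2 - min_one_way
```

    Keeping the sample with the smallest RTT is one common way to use multiple measurements, since the error bound shrinks as RTT approaches twice the minimum one-way delay.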

    Monitor Plane

    The monitor plane co-exists with the data plane and the control plane on the same channel. It is used to transfer monitoring data.

    [Figure: plane diagram with the labels Assist, Adjust, Monitor, and Control]

    Monitor Plane

    The monitor plane is used to collect data for monitoring the data plane.

    Switching in the monitor plane has two methods.

    Normally, the control plane assists monitor plane forwarding.

    Under error, the monitor plane falls back to flooding.

    Processing Center

    Collect port information, forwarding tables and

    neighbor information from all the switches.

    Construct the logical topology of switches

    based on the port & neighbor info

    Detect loops in the logical topology for STP loop

    problems

    Check for any missing/dead switches
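
    The three processing-center checks can be sketched in a few functions. This is a minimal illustration, assuming each switch reports only the neighbors reachable over ports in forwarding state, so that any cycle in the resulting graph indicates an STP loop. The report format and function names are hypothetical.

```python
def build_topology(reports):
    """reports: {switch_id: iterable of neighbor switch_ids}.
    Returns the set of undirected links as frozensets of two switch ids."""
    links = set()
    for switch, neighbors in reports.items():
        for neighbor in neighbors:
            if neighbor != switch:
                links.add(frozenset((switch, neighbor)))
    return links

def has_loop(links):
    """Union-find cycle check on an undirected graph."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]   # path halving
            node = parent[node]
        return node

    for link in links:
        a, b = tuple(link)
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True        # a and b were already connected: this link closes a cycle
        parent[root_a] = root_b
    return False

def missing_switches(expected, reports):
    """Switches that are expected in the network but sent no report."""
    return set(expected) - set(reports)
```

    For example, `has_loop(build_topology({"s1": ["s2", "s3"], "s2": ["s3"]}))` reports a loop, because the three links form a triangle.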

    Problems to Solve

    STP Error Detection

    End-to-end Error Detection

    Other Hardware/Software Errors of Switches and Their Detection

    TRILL Potential Problems

    End-to-end Connectivity

    Monitoring

    Based on the neighbor and port information, check whether all switches and end hosts are on a connected spanning tree (ST).

    End hosts also count as neighbors of leaf switches.

    The forwarding table also records information about past connectivity.
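
    A connectivity check of this kind reduces to a graph traversal. The sketch below assumes the neighbor reports have already been merged into one adjacency map that includes end hosts; the map format and the function name are illustrative only.

```python
from collections import deque

def unreachable_nodes(neighbors, start):
    """neighbors: {node: iterable of adjacent nodes}, switches and end hosts alike.
    Returns every known node that cannot be reached from start."""
    adjacency = {}
    for node, adjacent in neighbors.items():
        adjacency.setdefault(node, set())
        for other in adjacent:
            # Treat every reported adjacency as bidirectional.
            adjacency[node].add(other)
            adjacency.setdefault(other, set()).add(node)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for other in adjacency.get(node, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return set(adjacency) - seen
```

    An empty result means every switch and end host is on one connected component; a non-empty result lists the nodes that have been cut off.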

    Other Software Errors of Switches and Their Detection

    One-Way Link Problem. No backward frames. From EtherRake's view, the interface in the other direction is dead.

    Deferred Frames. The buffer is full, so frames have to be dropped.

    Encode the buffer status (e.g., full) in the status bits.

    Links between switches and routers disabled/not activated.

    Detected by the port status bits or a lack of heartbeat.

    Switches down, e.g., unbootable IOS problems.

    Limitations on Detecting Other Switch Software Errors

    Some errors have to be detected at the data plane or application plane.

    VLAN Problems. Hosts in the same VLAN cannot communicate with each other.

    Hardware Errors of Switches and Their Detection

    Switch Port Errors.

    Switch Module Errors.

    Both will be detected by the port status

    reports

    STP Errors (1)

    Count to Infinity when removing the root

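    [Figure: four snapshots of a five-bridge topology (bridges 1 to 5). In the first snapshot, with root bridge 1 present, the labels are 1,2 / 1,1 / 1,1 / 1,2. After bridge 1 is removed, the four remaining bridges are all labeled 1,2, then 1,3, then 1,4.]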

    STP Errors (2)

    Forwarding Loops

    BPDU Loss Induced Forwarding Loops. If a blocked port fails to receive BPDUs from its peer bridge for an extended period of time, it may start forwarding data.

    STP Errors (3)

    Forwarding Loops

    MaxAge Induced Forwarding Loops (MaxAge = 6)

    STP Errors (4)

    Forwarding Loops

    Count to Infinity Induced Forwarding Loops

    Pollution of Forwarding Tables

    Previous STP Error Detection

    EtherFuse (SIGCOMM '07)

    Plug a fuse into Ethernet

    Remaining Problems

    Where to plug it?

    How many do we need?

    Previous STP Error Detection

    Cisco Prevention Methods

    Loop Guard. Prevents loops induced by BPDU loss.

    Some Existing Solutions

    Cisco Discovery Protocol (CDP)

    Discovers Cisco devices in the neighborhood

    Monitors the liveness of neighboring nodes

    Limitations

    No detailed status report for diagnosis

    Limited to one hop.

    Cisco Unidirectional Link Detection (UDLD). Detects the one-way link problem.

    General Monitoring Metrics for

    Detection

    Connectivity. Based on the frame tree, EtherRake can find the connectivity of a path.

    Delay. EtherRake can link frames and calculate the time spent on each switch.

    Throughput. EtherRake can calculate throughput from the collected frames.
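
    The delay and throughput metrics can be illustrated with a small calculation over collected frame records. The record layout `(frame_id, switch, t_in, t_out, size_bytes)` is an assumption made for this sketch; the slides do not specify what the collectors actually report.

```python
from collections import defaultdict

def per_switch_delay(records):
    """records: iterable of (frame_id, switch, t_in, t_out, size_bytes).
    Returns {switch: mean time in seconds a frame spent on that switch}."""
    total = defaultdict(float)
    count = defaultdict(int)
    for _frame, switch, t_in, t_out, _size in records:
        total[switch] += t_out - t_in
        count[switch] += 1
    return {switch: total[switch] / count[switch] for switch in total}

def throughput_bps(records, switch):
    """Rough bits per second leaving one switch over the observed interval."""
    seen = [(t_out, size) for _f, sw, _t_in, t_out, size in records if sw == switch]
    if len(seen) < 2:
        return 0.0          # one sample gives no time interval to divide by
    span = max(t for t, _ in seen) - min(t for t, _ in seen)
    if span <= 0:
        return 0.0
    return 8 * sum(size for _, size in seen) / span
```

    The throughput figure is only a rough estimate, since it divides all observed bytes by the time between the first and last departure.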

    TRILL Potential Problems

    Routing loops

    Caused by inconsistent views of network topology.

    Mitigated using hop count

    Scalability issue:

    It is not yet clear how far TRILL can scale

    Backup

    Detection of STP Errors by

    EtherRake

    Find STP errors with EtherRake.

    Link collected frames into traces

    Detect frame forwarding loops

    Leverage the switch table and ARP table information

    Challenges

    Scalability: optimize collection of traces

    Ambiguity and accuracy: frame linking
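
    Frame linking and loop detection can be sketched as below. This is a simplified illustration: it assumes each observation carries a key that identifies the same frame across switches (for example a content hash), which sidesteps the ambiguity challenge listed above. The observation format and function names are hypothetical.

```python
from collections import defaultdict

def link_traces(observations):
    """observations: iterable of (frame_key, switch, timestamp).
    Returns {frame_key: [switch, ...]} with switches ordered by timestamp."""
    by_frame = defaultdict(list)
    for key, switch, timestamp in observations:
        by_frame[key].append((timestamp, switch))
    return {key: [switch for _ts, switch in sorted(sightings)]
            for key, sightings in by_frame.items()}

def looping_frames(traces):
    """Frames whose trace visits some switch more than once."""
    return {key for key, path in traces.items() if len(set(path)) < len(path)}
```

    A frame that shows up at the same switch twice is only a loop candidate; retransmissions with identical content would need to be ruled out separately.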

    End-to-end Connectivity

    Monitoring

    Diagnose a connectivity problem from A to B with EtherRake

    Find the frames that are on the way from A to B.

    Link the frames and find a path.

    Locate the problem.
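
    One simple way to locate the problem is to walk the expected path and find the first hop where the frame was never observed. The sketch assumes the expected path from A to B is already known (for example from the topology built by the processing center); that input and the function name are assumptions for illustration.

```python
def locate_break(expected_path, seen_at):
    """expected_path: ordered hops from A to B.
    seen_at: set of hops where the frame was actually observed.
    Returns (last_hop_seen, first_hop_missing), or None if every hop saw the frame."""
    last_seen = None
    for hop in expected_path:
        if hop not in seen_at:
            return last_seen, hop
        last_seen = hop
    return None
```

    For a path A, s1, s2, s3, B where the frame was seen only at A, s1, and s2, the result points at the link or switch between s2 and s3.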

    IP Router Errors: OSPF (2)

    Routing Load on Processors

    IP Router Errors: OSPF (3)

    Route Flaps. Routing table changes in a

    router, usually in response to a network

    failure or a recovery.

    IP Router Errors: DHCP

    DHCP problem

    Configuration problem.

    Inability to acquire or renew a lease.

    How to keep the same IP address in multi-boot machines?

    EtherFuse (1)

    An Ethernet fuse that is plugged into the network to monitor the status of the network.

    EtherFuse (2)

    Detection of Count to Infinity

    Track the cost to the same root R announced in BPDUs

    Detection of Forwarding Loops.

    Combination of Passive Sniffing and Active

    Probing.

    Package View Switching

    Forward packages from the point of view of the packages themselves.

    Each package remembers the history of the path it has already gone through and decides which way to go based on that memory.

    Here are the steps. (Generally speaking, it is a depth-first search from the view of the package.)

    Package View Switching

    (1) Normally, when a package arrives at a switch, it chooses the default port, which is the port that the control plane provides.

    (2) If the package has already tried the default port, it randomly chooses a new port that it has never been to.

    (3) If the package has tried every port at this switch, it goes back through the port it came from.

    (4) The package is discarded when it arrives back at its origin and finds no other way to go. Otherwise, the search ends when the package arrives at its destination, which is the monitor center.
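
    The four steps can be simulated as a depth-first walk in which the package carries its own memory. This is a minimal sketch under stated assumptions: a port is identified by the neighbor it leads to, the package's memory is modeled as a per-switch set of used ports plus a stack of the current path, and the function and parameter names are illustrative only.

```python
import random

def package_view_walk(links, default_port, origin, destination, rng=None):
    """Walk one package from origin toward destination.

    links: {switch: [neighbor, ...]}; each neighbor stands for one port.
    default_port: {switch: neighbor} suggested by the control plane (may be stale).
    Returns (delivered, hops), where hops lists every switch visited in order.
    """
    rng = rng or random.Random()
    tried = {}          # package memory: switch -> ports already used or arrived on
    stack = [origin]    # current depth-first path, used for going back
    hops = [origin]
    while stack:
        here = stack[-1]
        if here == destination:
            return True, hops
        used = tried.setdefault(here, set())
        untried = [n for n in links.get(here, ()) if n not in used]
        default = default_port.get(here)
        if default in untried:
            nxt = default                  # step (1): follow the control plane
        elif untried:
            nxt = rng.choice(untried)      # step (2): random port never tried
        else:
            stack.pop()                    # step (3): go back where it came from
            if stack:
                hops.append(stack[-1])
            continue                       # an empty stack at the origin is step (4)
        used.add(nxt)
        tried.setdefault(nxt, set()).add(here)   # remember the arrival port
        stack.append(nxt)
        hops.append(nxt)
    return False, hops
```

    With `links = {"s1": ["s2", "s3"], "s2": ["s1"], "s3": ["s1", "mc"], "mc": ["s3"]}` and a stale default of `{"s1": "s2"}`, the package first tries s2, returns to s1, and then reaches the monitor center `mc` through s3.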
