Date post: 27-Mar-2015
Challenges in Making Tomography Practical

Yiyi Huang, Georgia Tech
Nick Feamster, Georgia Tech
Renata Teixeira, LIP6
Christophe Diot, Thomson


Problem

• Network operators need to detect and isolate faults quickly, before customers complain

• Plenty of existing alarms
  – SNMP traps
  – Active probes
  – Anomaly detection systems

• Unfortunately, these alarms do not help operators locate and eliminate the faults that cause problems on end-to-end paths


Network Tomography to the Rescue

• Send end-to-end probes through the network
• Monitor paths for differences in reachability
• Infer location of reachability problem from these differences

[Figure: a monitor probing targets; labels x and y]


Some Problems

• Scalability vs. speed: Detection must be fast

• Ambiguity: Losses are one-way, but operators do not always have access to both ends of the path

• Lack of synchronization: Different monitors see different conditions

• Dynamics: Topology can change, loss can be transient


Doppler: Making Tomography Practical

• Fast, scalable detection
  – Solution: Monitor selection algorithm to reduce the number of monitors and targets so that “cycle times” are fast
• Transient packet loss
  – Solution: Triggered confirmation of failed paths
• One-way losses
  – Solution: New algorithm based on IP spoofing
• Dynamic routing
  – Solution: Periodic snapshots of the network topology

Controlled evaluation on VINI, plus limited wide-area experiments.


Fast, Scalable Detection

• Select monitors, targets to satisfy two conditions
  – All interfaces are “covered” (or diagnosable)
  – The number of monitors is small enough to ensure a short round time
• Two goals
  – Coverage: When a failure occurs, system detects it
    • Every interface is covered by at least one path
  – Diagnosability: When a failure occurs, system locates it
    • Every interface is covered by a unique set of paths
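
The two goals can be checked mechanically for a given set of probe paths. The following is a minimal sketch, assuming each path is given as a list of the interfaces it traverses; the function names and the dictionary layout are illustrative and do not come from the slides.

```python
from collections import defaultdict


def path_signatures(paths):
    """Map each interface to the set of path ids that traverse it."""
    sig = defaultdict(set)
    for pid, hops in paths.items():
        for iface in hops:
            sig[iface].add(pid)
    return {iface: frozenset(pids) for iface, pids in sig.items()}


def is_covered(paths, interfaces):
    """Coverage: every interface lies on at least one probed path."""
    sig = path_signatures(paths)
    return all(iface in sig for iface in set(interfaces))


def is_diagnosable(paths, interfaces):
    """Diagnosability: every interface is covered by a unique set of paths."""
    interfaces = set(interfaces)
    sig = path_signatures(paths)
    if not all(iface in sig for iface in interfaces):
        return False
    signatures = [sig[iface] for iface in interfaces]
    # Two interfaces with the same signature cannot be told apart.
    return len(set(signatures)) == len(signatures)
```

With a single path crossing interfaces a and b, both are covered but share the same path set, so a failure on that path could not be attributed to one of them.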


Offline Path Selection: Diagnosability

• Step 1: Compute the set of paths that cover all interfaces (greedy set cover heuristic)

• Step 2: Compute hitting set for each interface

• Step 3: Build equivalence classes for interfaces with common hitting set
  – For each interface in a set with more than one interface, find path that crosses only that interface
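
The three steps above might look roughly like the sketch below. It is an interpretation, not the authors' implementation: "hitting set" is taken to mean the selected paths that traverse an interface, and "crosses only that interface" is read as "crosses that interface and no other member of its equivalence class". All names are hypothetical.

```python
def greedy_cover(candidates, interfaces):
    """Step 1: greedy set cover over candidate paths (id -> interface list)."""
    uncovered = set(interfaces)
    selected = []
    while uncovered:
        # Pick the path covering the most still-uncovered interfaces.
        best = max(sorted(candidates),
                   key=lambda p: len(uncovered & set(candidates[p])))
        gain = uncovered & set(candidates[best])
        if not gain:
            break  # the remaining interfaces cannot be covered at all
        selected.append(best)
        uncovered -= gain
    return selected


def hitting_sets(candidates, selected, interfaces):
    """Step 2: for each interface, the selected paths that traverse it."""
    return {i: frozenset(p for p in selected if i in candidates[p])
            for i in sorted(interfaces)}


def refine(candidates, selected, interfaces):
    """Step 3: add paths that separate interfaces sharing a hitting set."""
    selected = list(selected)
    classes = {}
    for iface, hs in hitting_sets(candidates, selected, interfaces).items():
        classes.setdefault(hs, set()).add(iface)
    for members in classes.values():
        if len(members) < 2:
            continue  # already uniquely identified
        for iface in sorted(members):
            others = members - {iface}
            for p in sorted(candidates):
                hops = set(candidates[p])
                if p not in selected and iface in hops and not hops & others:
                    selected.append(p)
                    break
    return selected


if __name__ == "__main__":
    cands = {"p1": ["a", "b"], "p2": ["b", "c"], "p3": ["a"], "p4": ["c", "d"]}
    cover = greedy_cover(cands, "abcd")
    print(cover, refine(cands, cover, "abcd"))
```

A class stays ambiguous when no candidate path separates its members; the sketch simply leaves such classes as they are.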


Detection, Confirmation, Correlation

• Periodic (once per 5 minutes) topology snapshot from all monitors to all destinations keeps track of underlying topology before the failure

• Detection: Periodic probes (once per “cycle time”) detect failure

• Confirmation: When a probe is lost, the monitor sends three additional probes. If all three are lost, path is determined to have failed.

• Correlation: Paths that fail within 10 seconds of one another are grouped.
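
The confirmation and correlation rules can be sketched as follows. The slides do not say whether "within 10 seconds of one another" is measured from the first failure in a group or from the previous one; this sketch assumes the latter. `send_probe` is a hypothetical callable that returns True when a probe is answered.

```python
def confirm_failure(send_probe, path, retries=3):
    """Declare the path failed only if all follow-up probes are lost."""
    return all(not send_probe(path) for _ in range(retries))


def correlate(failures, window=10.0):
    """Group (timestamp_seconds, path) failures that occur close in time.

    Assumption: a failure joins the current group when it occurs within
    `window` seconds of the previous failure.
    """
    groups = []
    last_time = None
    for t, path in sorted(failures):
        if last_time is not None and t - last_time <= window:
            groups[-1].append(path)
        else:
            groups.append([path])
        last_time = t
    return groups


if __name__ == "__main__":
    print(confirm_failure(lambda path: False, "M1->T1"))
    print(correlate([(0.0, "M1->T1"), (4.0, "M2->T1"), (30.0, "M1->T3")]))
```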


Disambiguating One-Way Losses: Spoofing

• Monitor sends request to spoofer to send probe
• Probe carries the IP address of the monitor as its source address
• If reply reaches the monitor, reverse path is working

[Figure: monitor M, target T, and a spoofer that sends a spoofed packet with source address of M]
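
The decision logic implied by this slide can be written down as a small function. This is only a sketch of the inference, not of packet spoofing itself, and the names are illustrative. The slide states only the positive case (a reply to the spoofed probe means the reverse path works), so the negative case is left undetermined here.

```python
from enum import Enum


class Verdict(Enum):
    NO_FAILURE = "no failure observed"
    FORWARD = "reverse path works, so the forward path is suspected"
    UNDETERMINED = "not determined by this test"


def classify(direct_reply_received, spoofed_reply_received):
    """Combine the monitor's own probe with the spoofed probe's outcome.

    direct_reply_received: the monitor's probe to the target was answered.
    spoofed_reply_received: the target's reply to the spoofer's probe
    (sent with the monitor's source address) arrived at the monitor.
    """
    if direct_reply_received:
        return Verdict.NO_FAILURE
    if spoofed_reply_received:
        # The target-to-monitor direction delivered a packet.
        return Verdict.FORWARD
    # Assumption: a missing reply could stem from the reverse path or
    # from the spoofer-to-target path, so no conclusion is drawn.
    return Verdict.UNDETERMINED
```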


Identification: NetDiagnoser

• Binary network tomography algorithm [Dhamdhere et al.]

• Input: hosts, destinations, topology before the failure

• Output: Set of possible locations for the fault
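
To illustrate the input/output shape, here is a generic binary-tomography heuristic. It is not NetDiagnoser's actual algorithm: it assumes that any link on a working path is healthy, then greedily picks the suspect links that explain the most failed paths. The function names and the path dictionary are hypothetical.

```python
def candidate_links(paths, failed):
    """Links on failed paths that do not appear on any working path."""
    working_links = set()
    for pid, links in paths.items():
        if pid not in failed:
            working_links.update(links)
    suspects = set()
    for pid in failed:
        suspects.update(l for l in paths[pid] if l not in working_links)
    return suspects


def smallest_explanation(paths, failed):
    """Greedily choose suspect links until every failed path is explained."""
    suspects = candidate_links(paths, failed)
    unexplained = set(failed)
    chosen = []
    while unexplained:
        best = max(sorted(suspects),
                   key=lambda l: sum(1 for p in unexplained if l in paths[p]),
                   default=None)
        if best is None:
            break  # no suspect links at all
        hit = {p for p in unexplained if best in paths[p]}
        if not hit:
            break  # remaining failures cannot be explained by any suspect
        chosen.append(best)
        unexplained -= hit
    return chosen


if __name__ == "__main__":
    topo = {"m1-t1": ["l1", "l2"], "m1-t2": ["l1", "l3"], "m2-t1": ["l4", "l2"]}
    print(smallest_explanation(topo, {"m1-t1", "m2-t1"}))
```

The topology passed in should be the snapshot taken before the failure, since the failed paths can no longer be traced afterwards.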


Evaluation of Detection Algorithms

• Controlled experiments on the VINI testbed
  – Emulated copy of Abilene network on wide-area paths
  – Probing strategy emulates the paths that would be probed in monitor selection algorithm
  – Compare reduced set of paths to “aggressive” measurement approach
• Varied failure location and duration
  – Duration varied from 5 to 80 seconds
  – Test repeated for each failed link
• Measure detection and false alarm rates
• Preliminary experiments using data from real-world networks


Detection: Scale and Speed

• Compute reduction in the number of paths required to achieve coverage and diagnosability
  – Reduction from about 27,000 paths to 151 paths
• For real-world networks, compute corresponding reduction in cycle time
  – Reduction from about 3.5 minutes to < 5 seconds


Single-Link Failures

• More selective probing identifies more of the shorter link failures (due to shorter cycle time)

• Also results in fewer false alarms


Single-Node Failures

• Similar results to single-link failures
  – Selective measurements result in faster detection, fewer false alarms


Does Failure Confirmation Reduce the Total Number of Alarms?

• Confirmation reduces the number of failures by > 35%
• Correlation further reduces the number of alarms (by about a factor of 10)


How Quickly can Doppler Identify Failures?

• Answer: Roughly 20 seconds using the reduced set of paths

• Two main components
  – Detection/Confirmation: Time from when failure was injected to the time Doppler could detect and confirm the failure
  – Correlation: Time to group failures and construct reachability matrix


Detection and Confirmation Delay

Most failures are detected within 3-5 seconds


Correlation Delay

Reducing the number of paths to probe significantly reduces total correlation time


Summary

• Making tomography practical is challenging
  – Asynchronous measurements
  – Scale and speed
  – Changing topologies
  – Ambiguity about forward and reverse paths

• Doppler: Set of techniques to address many of these problems

• Current analysis is still performed offline
  – Many additional challenges remain to coordinate online measurements

