8/6/2019 Enabling Flow-Level Latency Measurements Across Routers in Data Centers_ppt
Enabling Flow-level Latency
Measurements across Routers in Data
Centers
Parmjeet Singh, Myungjin Lee
Sagar Kumar, Ramana Rao Kompella
Latency-critical applications in data centers
Guaranteeing low end-to-end latency is important:
Web search (e.g., Google's instant search service)
Retail advertising
Recommendation systems
High-frequency trading in financial data centers
Operators want to troubleshoot latency anomalies
End-host latencies can be monitored locally
Detection, diagnosis, and localization across the network are hard: routers/switches have no native support for latency measurements
Prior solutions
Lossy Difference Aggregator (LDA), Kompella et al. [SIGCOMM '09]
Aggregate latency statistics
Reference Latency Interpolation (RLI), Lee et al. [SIGCOMM '10]
Per-flow latency measurements
More suitable here due to its more fine-grained measurements
Deployment scenario of RLI
Upgrading all switches/routers in a data center network
Pros: provides the finest granularity of latency anomaly localization
Cons: significant deployment cost; possible downtime of the entire production data center
In this work, we consider partial deployment of RLI
Our approach: RLI across Routers (RLIR)
Overview of RLI architecture
Goal
Latency statistics on a per-flow basis between interfaces
Problem setting
Storing a timestamp for each packet at ingress and egress is infeasible due to high storage and communication cost
Regular packets do not carry timestamps
[Figure: a router with ingress interface I and egress interface E]
Overview of RLI architecture
Premise of RLI: delay locality
Approach
1) The injector sends reference packets regularly
2) Reference packet carries ingress timestamp
3) Linear interpolation: compute per-packet latency estimates at the latency estimator
4) Per-flow estimates by aggregating per-packet estimates
[Figure: reference packets (R) injected at ingress I arrive at egress E among regular packets (L); the latency estimator draws a linear interpolation line between the delays of consecutive reference packets and reads off each regular packet's interpolated delay]
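The interpolation and aggregation steps above can be sketched as follows; this is a minimal illustrative sketch with assumed names, not the authors' implementation:

```python
from collections import defaultdict

def interpolate_delay(ref_before, ref_after, t_egress):
    """Estimate a regular packet's delay by linear interpolation
    between the delays (d1, d2) of the two reference packets that
    bracket its egress time t_egress."""
    (t1, d1), (t2, d2) = ref_before, ref_after
    if t2 == t1:
        return d1
    frac = (t_egress - t1) / (t2 - t1)
    return d1 + frac * (d2 - d1)

# Per-flow aggregation of per-packet estimates: (sum, count) per flow.
flow_stats = defaultdict(lambda: [0.0, 0])

def record(flow_id, estimate):
    flow_stats[flow_id][0] += estimate
    flow_stats[flow_id][1] += 1

def mean_flow_delay(flow_id):
    total, n = flow_stats[flow_id]
    return total / n if n else None
```

A packet leaving halfway between two reference packets with delays 1 ms and 3 ms would get an estimate of 2 ms.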
Full vs. Partial deployment
Full deployment: 16 RLI sender-receiver pairs
Partial deployment: 4 RLI senders + 2 RLI receivers
81.25% deployment cost reduction
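The 81.25% figure follows if each sender-receiver pair is counted as two instrumented elements (an assumption consistent with the slide's numbers):

```python
full = 16 * 2    # 16 RLI sender-receiver pairs -> 32 instrumented elements
partial = 4 + 2  # 4 RLI senders + 2 RLI receivers
reduction = 1 - partial / full
print(f"{reduction:.2%}")  # prints 81.25%
```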
[Figure: six-switch topology; legend: RLI Sender (Reference Packet Injector), RLI Receiver (Latency Estimator)]
Case 1: Presence of cross traffic
Issue: inaccurate link utilization estimation at the sender leads to an excessive reference packet injection rate
Approach: do not actively address the issue
Evaluation shows little impact on the packet loss rate
Details in the paper
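For intuition, a toy sketch of why underestimating utilization inflates the injection rate; the policy and parameters are illustrative assumptions, not RLI's actual controller:

```python
def injection_rate(est_utilization, max_rate=0.10, min_rate=0.01):
    """Toy adaptive policy: inject reference packets aggressively on
    lightly loaded links, backing off as estimated utilization rises.
    (Illustrative parameters; not the scheme from the RLI paper.)"""
    rate = max_rate * (1.0 - est_utilization)
    return max(min_rate, min(max_rate, rate))

# Cross traffic invisible to the sender lowers its utilization
# estimate, so it injects faster than the bottleneck link warrants.
true_util, seen_util = 0.93, 0.50
assert injection_rate(seen_util) > injection_rate(true_util)
```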
[Figure: six-switch topology; cross traffic shares the bottleneck link, distorting link utilization estimation at Switch 1; legend: RLI Sender (Reference Packet Injector), RLI Receiver (Latency Estimator)]
Case 2: RLI Sender side
Issue: traffic may take different routes at an intermediate switch
Approach: the sender sends reference packets to all receivers
[Figure: six-switch topology; legend: RLI Sender (Reference Packet Injector), RLI Receiver (Latency Estimator)]
Case 3: RLI Receiver side
Issue: hard to associate reference packets with regular packets that traversed the same path
Approaches:
Packet marking: requires native support from routers
Reverse ECMP computation: reverse-engineer intermediate routes using the ECMP hash function
IP prefix matching: applicable only in limited situations
[Figure: six-switch topology; legend: RLI Sender (Reference Packet Injector), RLI Receiver (Latency Estimator)]
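The reverse-ECMP idea can be sketched as below; the hash function (CRC32) and the five-tuple key are illustrative assumptions, since real switches use vendor-specific hashes:

```python
import zlib

def ecmp_next_hop(five_tuple, next_hops):
    """Pick a next hop by hashing the flow five-tuple, as an ECMP
    switch would (illustrative hash; real switches differ)."""
    key = ",".join(map(str, five_tuple)).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

def same_intermediate_path(ref_tuple, pkt_tuple, next_hops):
    """A receiver associates a reference packet with a regular packet
    only if reverse-computing the ECMP choice maps both flows onto
    the same intermediate switch."""
    return ecmp_next_hop(ref_tuple, next_hops) == ecmp_next_hop(pkt_tuple, next_hops)
```

Because the hash is deterministic, the receiver can replay it offline to recover which intermediate switch a given flow traversed.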
Deployment example in fat-tree topology
[Figure: fat-tree topology with RLI senders (reference packet injectors) and RLI receivers (latency estimators); receivers use IP prefix matching at some positions, and reverse ECMP computation / IP prefix matching at others]
Evaluation
Simulation setup
Trace: regular traffic (22.4M pkts) + cross traffic (70M pkts)
Simulator
Results: accuracy of per-flow latency estimates
[Figure: simulation setup; a traffic divider splits the packet trace into regular traffic through the RLI sender and cross traffic through a cross-traffic injector between Switch 1 and Switch 2; reference packets are injected at a 10% or 1% rate toward the RLI receiver]
Accuracy of per-flow latency estimates
[Figure: CDF of relative error of per-flow latency estimates at 10% and 1% injection rates (bottleneck link utilization: 93%); annotated values include 1.2%, 4.5%, 18%, 31%, and 67%]
Summary
Low-latency applications in data centers: localization of latency anomalies is important
RLI provides flow-level latency statistics, but full deployment (i.e., at all routers/switches) is expensive
Proposed a solution enabling partial deployment of RLI
Little loss in localization granularity (i.e., localization to every other router)
Thank you! Questions?