
Before We Start

• How to read a research paper?

• How to write a paper review?


Reading A Research Paper

• Differs from reading a textbook

• May not always provide all background details

• Iterative activity

• Often requires obtaining additional background or skipping details

Active Reading

• Use a highlighter

• Write questions down

• Underline

• Go back and forth between sections

Organization

• Abstract

• Introduction

• Background/Related work

• Proposed Technique

• Evaluation

• Conclusion

Abstract

• Usually less than 300 words. Summarizes the paper, including a brief motivation, a high-level description of the contribution, and the results/conclusions

• Stands on its own. The abstract of a paper can be removed and the rest of the paper is still complete

• Don’t need to define all terms in an abstract

• If you do define them, you need to define them again in the appropriate part of the paper

Introduction

• This provides the high-level motivation for the work. If the introduction is not compelling, the paper will not be read

• Should also state what problem is being solved, what other work has been done that is similar, and why this work is unique

• Often ends with a list of the contributions

Introduction

[Figure: VM with guest OS and application, under client web service control; oversubscription]

Motivation

• The principal bottleneck in large-scale clusters is often inter-node communication bandwidth

• Two solutions:

‣ Specialized hardware and communication protocols, e.g. InfiniBand, Myrinet (supercomputer environments). Cons: no commodity parts (expensive); the protocols are not compatible with TCP/IP

‣ Commodity Ethernet switches. Cons: scale poorly; cost increases non-linearly with cluster size; requires a high-end core switch or oversubscription (a tradeoff)

Oversubscription Ratio

[Figure: n servers (Server 1, Server 2, ..., Server n), each with link bandwidth B, sharing one upper link of bandwidth UB]

Oversubscription Ratio = B*n/UB
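
The ratio above can be computed directly. The following is a minimal sketch; the function name and the 40-server rack are illustrative assumptions, not values from the slides.

```python
def oversubscription_ratio(n_servers: int, server_bw: float, uplink_bw: float) -> float:
    """Return B*n/UB: aggregate server bandwidth divided by upper-link bandwidth."""
    if uplink_bw <= 0:
        raise ValueError("uplink bandwidth must be positive")
    return n_servers * server_bw / uplink_bw


# Hypothetical rack: 40 servers at 1 Gbit/s each sharing one 10 Gbit/s uplink.
ratio = oversubscription_ratio(n_servers=40, server_bw=1.0, uplink_bw=10.0)
print(f"{ratio:g}:1")  # prints "4:1"
```

A ratio of 1:1 means every host can send at full link speed at the same time; anything higher means hosts contend for the uplink.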

Current Data Center Topology

• Edge hosts connect to 1G Top of Rack (ToR) switch

• ToR switches connect to 10G End of Row (EoR) switches

• Large clusters: EoR switches to 10G core switches

Oversubscription of 2.5:1 to 8:1 typical in guidelines

Key challenges: performance, cost, routing, energy, cabling

Data Center Cost

Design Goals

• Scalable interconnection bandwidth: arbitrary host communication at full bandwidth

• Economies of scale: commodity switches

• Backward compatibility: compatible with hosts running Ethernet and IP

Related Work

• Sometimes combined with background

• Briefly describes other similar research papers and tells how the work presented in the current paper differs

• Not meant to be a complete summary of each paper

• Usually categorizes other work into like groups. Often found right before conclusions

Background

• Any technical background that is required to understand the techniques that are presented should appear here.

• Often has a motivating example that can be used to explain the ideas.

Proposed Technique

• This is where you provide detail of the new material in your paper. You can present algorithms/methods/processes, etc.

• If you have a running example from the intro or background you can use this to illustrate your ideas.

Fat-Tree Topology

• k pods

• k/2 edge switches in each pod

• k/2 aggregation switches in each pod

• k/2 servers in each rack
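
The per-pod counts above determine the size of the whole network. Below is a small sketch that derives the totals for a given k; the function name and dictionary keys are illustrative, and the core-switch count assumes the (k/2)*(k/2) core grid mentioned on the addressing slide.

```python
def fat_tree_sizes(k: int) -> dict:
    """Element counts for a fat-tree built from k-port switches."""
    if k < 2 or k % 2:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k/2 per pod
        "aggregation_switches": k * half,  # k/2 per pod
        "core_switches": half * half,      # (k/2) x (k/2) grid
        "hosts": k * half * half,          # k/2 hosts per edge switch
    }


sizes = fat_tree_sizes(4)
print(sizes["hosts"], sizes["core_switches"])  # prints "16 4"
```

For k = 4 this gives 16 hosts and 20 switches, which matches the 36 elements used in the testbed described later in the slides.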

Fat-tree Topology Equivalent

Routing

• IP needs extension here!

• (k/2)*(k/2) shortest paths between any two hosts in different pods!

• Single-path routing vs. multi-path routing

• Static vs. dynamic

ECMP (Equal-Cost Multi-Path Routing)

• Static flow scheduling

• Limits the multiplicity of paths to 8-16

• Increases the routing table size multiplicatively, and hence the lookup latency

• Advantage: no packet reordering needed, and modern switches support it

[Figure: hash-threshold port selection: extract the source and destination addresses, apply a hash function (CRC16), then determine which region (1 to 4) of the hash space, starting at 0, the value falls in]
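
The hash-threshold selection in the figure can be sketched as follows. This is an illustration under stated assumptions: `binascii.crc_hqx` (the CRC-CCITT variant from the Python standard library) stands in for the unspecified CRC16, the function names are hypothetical, and ports are numbered from 0 rather than from 1 as in the figure.

```python
import binascii
import ipaddress

HASH_SPACE = 1 << 16  # a 16-bit CRC yields values in [0, 65535]


def flow_hash(src: str, dst: str) -> int:
    """Hash the source and destination addresses with a 16-bit CRC."""
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    return binascii.crc_hqx(key, 0)


def hash_threshold_port(src: str, dst: str, n_ports: int) -> int:
    """Split the hash space into n_ports equal regions and return the region index."""
    region_size = -(-HASH_SPACE // n_ports)  # ceiling division
    return flow_hash(src, dst) // region_size


port = hash_threshold_port("10.0.1.2", "10.2.0.3", n_ports=4)
print(port)  # some value in 0..3, identical for every packet of this address pair
```

Because the port depends only on the address pair, all packets between two hosts follow the same path, which is why this scheme does not reorder packets.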

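Two-level Routing Table

• Routing aggregation

Host addresses forwarded on port 0: 192.168.1.2/24, 192.168.1.10/24, 192.168.1.45/24, 192.168.1.89/24

Host addresses forwarded on port 1: 192.168.2.3/24, 192.168.2.8/24, 192.168.2.10/24

Aggregated routing table:

Prefix            Outgoing Port
192.168.1.0/24    0
192.168.2.0/24    1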
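
Route aggregation collapses per-host entries into one entry per subnet. Here is a minimal sketch using the standard `ipaddress` module; the `aggregate` function is illustrative, and the host-to-port mapping follows the example addresses above.

```python
import ipaddress


def aggregate(host_routes: dict) -> dict:
    """Collapse 'address/prefixlen' -> port entries into one entry per network."""
    table = {}
    for cidr, port in host_routes.items():
        network = str(ipaddress.ip_interface(cidr).network)
        # Aggregation is only valid if every host in a network uses the same port.
        if table.setdefault(network, port) != port:
            raise ValueError(f"{network} maps to more than one port")
    return table


hosts = {
    "192.168.1.2/24": 0, "192.168.1.10/24": 0,
    "192.168.1.45/24": 0, "192.168.1.89/24": 0,
    "192.168.2.3/24": 1, "192.168.2.8/24": 1, "192.168.2.10/24": 1,
}
print(aggregate(hosts))  # {'192.168.1.0/24': 0, '192.168.2.0/24': 1}
```

Seven host routes shrink to two prefix entries, which is what keeps the routing tables small.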

Two-level Routing Table: Addressing

• Using the 10.0.0.0/8 private IP address block

• Pod switch: 10.pod.switch.1. The pod range is [0, k-1] (left to right); the switch range is [0, k-1] (left to right, bottom to top)

• Core switch: 10.k.i.j, where (i, j) is the switch's position in the (k/2)*(k/2) grid

• Host: 10.pod.switch.ID. The ID range is [2, k/2+1] (left to right)
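
The addressing scheme can be enumerated mechanically. The sketch below makes two assumptions that the bullets do not state outright: core grid coordinates run from 1 to k/2 (consistent with example addresses such as 10.4.1.1 and 10.4.2.2 for k = 4), and hosts attach to the lower-layer switches numbered 0 to k/2-1 in each pod. The function name is illustrative.

```python
def fat_tree_addresses(k: int) -> dict:
    """Enumerate switch and host addresses for a k-ary fat-tree."""
    half = k // 2
    pod_switches = [f"10.{pod}.{sw}.1" for pod in range(k) for sw in range(k)]
    # Assumption: grid coordinates i and j each run from 1 to k/2.
    core_switches = [f"10.{k}.{i}.{j}"
                     for i in range(1, half + 1) for j in range(1, half + 1)]
    # Assumption: hosts hang off the lower-layer switches 0 .. k/2-1 of each pod.
    hosts = [f"10.{pod}.{sw}.{host_id}"
             for pod in range(k) for sw in range(half)
             for host_id in range(2, half + 2)]
    return {"pod_switches": pod_switches,
            "core_switches": core_switches,
            "hosts": hosts}


addrs = fat_tree_addresses(4)
print(addrs["core_switches"])  # ['10.4.1.1', '10.4.1.2', '10.4.2.1', '10.4.2.2']
```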

Two-level Routing Table

[Figure: fat-tree labeled with example addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.4.1.1, 10.4.1.2, 10.4.2.1, 10.4.2.2, 10.2.0.2, 10.2.0.3]

Two-level Routing Table

• Two-level Routing Table Structure

• Two-level Routing Table implementation: TCAM (Ternary Content-Addressable Memory)

‣ Parallel searching

‣ Priority encoding

Two-level Routing Table: example

• Example

Prefix          Outgoing Port
10.0.0.0/24     0
10.0.1.1/24     1
0.0.0.0/0
  Suffix        Outgoing Port
  0.0.0.2/8     3
  0.0.0.3/8     2

Prefix          Outgoing Port
10.0.1.2/32     0
10.0.1.3/32     1
0.0.0.0/0
  Suffix        Outgoing Port
  0.0.0.2/8     2
  0.0.0.3/8     3

Prefix          Outgoing Port
10.0.0.0/16     0
10.1.0.0/16     1
10.2.0.0/16     2
10.3.0.0/16     3

Prefix          Outgoing Port
10.2.0.0/24     0
10.2.1.0/24     1
0.0.0.0/0
  Suffix        Outgoing Port
  0.0.0.2/8     2
  0.0.0.3/8     3

Prefix          Outgoing Port
10.2.0.2/32     0
10.2.1.3/32     1
0.0.0.0/0
  Suffix        Outgoing Port
  0.0.0.2/8     2
  0.0.0.3/8     3
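
A lookup in such a table can be sketched in software as follows. This is an illustrative model, not the TCAM implementation: the class name is hypothetical, a prefix entry with port `None` stands for the 0.0.0.0/0 entry that falls through to the suffix table, and a suffix such as 0.0.0.2/8 is interpreted as a match on the low 8 bits of the destination address.

```python
import ipaddress


class TwoLevelTable:
    def __init__(self, prefixes, suffixes):
        # prefixes: (cidr, port) pairs; port None means "consult the suffix table".
        self.prefixes = sorted(
            ((ipaddress.ip_network(cidr), port) for cidr, port in prefixes),
            key=lambda entry: entry[0].prefixlen, reverse=True)  # longest first
        # suffixes: (address, bits, port) triples matched on the low `bits` bits.
        self.suffixes = [(int(ipaddress.ip_address(addr)), bits, port)
                         for addr, bits, port in suffixes]

    def lookup(self, dst: str):
        addr = ipaddress.ip_address(dst)
        for network, port in self.prefixes:
            if addr in network:
                if port is not None:
                    return port  # terminating prefix: done
                break            # non-terminating prefix: go to second level
        for value, bits, port in self.suffixes:
            mask = (1 << bits) - 1
            if (int(addr) & mask) == (value & mask):
                return port
        return None


table = TwoLevelTable(
    prefixes=[("10.2.0.0/24", 0), ("10.2.1.0/24", 1), ("0.0.0.0/0", None)],
    suffixes=[("0.0.0.2", 8, 2), ("0.0.0.3", 8, 3)])
print(table.lookup("10.2.0.3"), table.lookup("10.3.0.3"))  # prints "0 3"
```

Traffic for the switch's own pod matches a terminating prefix, while traffic to other pods is spread over the uplinks according to the host ID in the last octet.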

Two-Level Routing Table

• Avoid Packet Reordering

• Traffic diffusion occurs in the first half of a packet’s journey

• Centralized Protocol to Initialize the Routing Table

Flow Classification (Dynamic)

• Soft state (compatible with the two-level routing table)

• A flow = a sequence of packets with the same source and destination IP addresses

• Avoid reordering within a flow

• Balancing

• Assignment and updating

Flow Classification: Flow Assignment

• Compute Hash(Src, Des)

• Have seen this hash value?

‣ Yes: look up the previously assigned port x and send the packet on port x

‣ No: record a new flow record f, assign f to the least-loaded port x, and send the packet on port x
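
The assignment logic above can be sketched as a small class. This is an illustration only: the class and method names are hypothetical, load is approximated by bytes forwarded per port, and a Python dict keyed by the (source, destination) pair plays the role of the hash lookup.

```python
class FlowClassifier:
    def __init__(self, uplink_ports):
        self.load = {port: 0 for port in uplink_ports}  # bytes sent per port
        self.assigned = {}                               # (src, dst) -> port

    def forward(self, src: str, dst: str, size: int) -> int:
        """Return the port for this packet, assigning new flows to the least-loaded port."""
        flow = (src, dst)
        port = self.assigned.get(flow)
        if port is None:
            # New flow: record it and pin it to the currently least-loaded port.
            port = min(self.load, key=self.load.get)
            self.assigned[flow] = port
        self.load[port] += size
        return port


classifier = FlowClassifier(uplink_ports=[2, 3])
first = classifier.forward("10.0.1.2", "10.2.0.3", 1500)
second = classifier.forward("10.0.1.3", "10.3.0.2", 1500)
again = classifier.forward("10.0.1.2", "10.2.0.3", 1500)
print(first, second, again)  # prints "2 3 2"
```

Once a flow is pinned to a port, later packets of that flow reuse it, which keeps the packets of a flow in order.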

Flow Classification: Update

•

Flow Scheduling

• The distribution of transfer times and burst lengths of Internet traffic is long-tailed

• Large flows dominate

• Large flows should be handled specially

Flow Scheduling

▪ Eliminates global congestion

▪ Prevents long-lived flows from sharing the same links

▪ Assigns long-lived flows to different links

• Edge switch: detects a flow whose size grows above a threshold and notifies the central controller

• Central controller: assigns this flow to a non-conflicting path
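
The interaction between edge switches and the central controller can be sketched as below. This is a simplified model under stated assumptions: all names are hypothetical, links are tracked at pod granularity (one uplink from the source pod to a core switch and one downlink from that core switch to the destination pod), and the controller takes the first core switch whose two links are both unreserved.

```python
class CentralScheduler:
    def __init__(self, n_core: int):
        self.n_core = n_core
        self.reserved = set()  # links already carrying a large flow
        self.placement = {}    # flow id -> chosen core switch

    def place(self, flow_id, src_pod: int, dst_pod: int):
        """Reserve a path through the first core switch with no reserved link."""
        for core in range(self.n_core):
            up, down = ("up", src_pod, core), ("down", core, dst_pod)
            if up not in self.reserved and down not in self.reserved:
                self.reserved.update((up, down))
                self.placement[flow_id] = core
                return core
        return None  # no non-conflicting path available


class EdgeSwitch:
    def __init__(self, scheduler: CentralScheduler, threshold_bytes: int):
        self.scheduler = scheduler
        self.threshold = threshold_bytes
        self.bytes_seen = {}
        self.reported = set()

    def observe(self, flow_id, src_pod: int, dst_pod: int, nbytes: int):
        """Count bytes per flow; notify the scheduler once a flow exceeds the threshold."""
        self.bytes_seen[flow_id] = self.bytes_seen.get(flow_id, 0) + nbytes
        if self.bytes_seen[flow_id] > self.threshold and flow_id not in self.reported:
            self.reported.add(flow_id)
            return self.scheduler.place(flow_id, src_pod, dst_pod)
        return None


scheduler = CentralScheduler(n_core=4)
edge = EdgeSwitch(scheduler, threshold_bytes=1000)
print(edge.observe("a", 0, 1, 600))   # None: still below the threshold
print(edge.observe("a", 0, 1, 600))   # 0: now a large flow, placed on core 0
print(edge.observe("b", 0, 1, 2000))  # 1: core 0 links are taken, so core 1
```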

Fault-Tolerance

• Bidirectional Forwarding Detection session (BFD)

• Lower- to Upper-layer Switches

• Upper-layer to Core Switches

• With flow scheduling, failures are much easier to handle.

Failure between upper-layer and core switches

• Outgoing inter-pod traffic: the local routing table marks the affected link as unavailable and chooses another core switch

• Incoming inter-pod traffic: the core switch broadcasts a tag to the upper-layer switches it is directly connected to, signifying its inability to carry traffic to that entire pod; the upper-layer switches then avoid that core switch when assigning flows destined to that pod

Failure between lower- and upper-layer switches

• Outgoing inter- and intra-pod traffic from the lower-layer switch: the local flow classifier sets the cost of the link to infinity, does not assign it any new flows, and chooses another upper-layer switch

• Intra-pod traffic using the upper-layer switch as intermediary: the switch broadcasts a tag notifying all lower-layer switches, which check it when assigning new flows and avoid the affected switch

• Inter-pod traffic coming into the upper-layer switch: the switch sends a tag to all its core switches signifying its inability to carry traffic; the core switches mirror this tag to all upper-layer switches, which then avoid the affected core switch when assigning new flows

Evaluation

• A strong paper has a very extensive evaluation, including both testbed-based and simulation-based evaluation.

• Parts of this section include: methodology, metrics used, results, comparison, etc.

Experiment Description: Fat-tree, Click

• In a 4-port fat-tree, there are 16 hosts, four pods (each with four switches), and four core switches.

• We multiplex these 36 elements onto ten physical machines, interconnected by a 48-port ProCurve 2900 switch with 1 Gigabit Ethernet links.

• Each pod of switches is hosted on one machine; each pod’s hosts are hosted on one machine; and the two remaining machines run two core switches each.

• Each link is bandwidth-limited to 96 Mbit/s to ensure that the configuration is not CPU limited.

• Each host generates a constant 96 Mbit/s of outgoing traffic

Experiment Description: Hierarchical Tree, Click

• Four machines run four hosts each, and four machines each run four pod switches with one additional uplink

• The four pod switches are connected to a 4-port core switch running on a dedicated machine.

• 3.6:1 oversubscription on the uplinks from the pod switches to the core switch

• Each host generates a constant 96 Mbit/s of outgoing traffic

Result

Power and Heat

End of the Paper

• Conclusions and Future Work:

‣ Summarize your results and any conclusions drawn

‣ Describe briefly the main areas you plan to pursue as future work

Conclusion

• Bandwidth is the scalability bottleneck in large scale clusters

• Existing solutions are expensive and limit cluster size

• Fat-tree topology with scalable routing and backward compatibility with TCP/IP and Ethernet

What Makes A Good Research Paper

• Good methodology/technique

‣ Novel: new and not resembling something formerly known or used

• Good writing

• Publish in good journals/conferences

Paper Reviews

• We will write a series of paper reviews in this class

• The first review, for this paper, is due next Tuesday in class

• Hand in a hard copy

Paper Review Form

• Paper (author, title, complete reference)

• Short (100-word) overview description

• Questions:

‣ What problem is the paper addressing?

‣ What is the contribution?

‣ Did you find any drawbacks?

‣ What is your assessment of the overall presentation style of the paper (consistency, clarity, ease of reading)?

‣ Any possible future work or improvements?

Date post:	26-Dec-2015
Category:	Documents
Upload:	silvester-bruce