Page 1:

Network Stack Specialization for Performance
Presented by Donghwi Kim
(Some figures are brought from the paper)

Page 2:

Objective

• The authors try to show the upper bound of network-application performance achievable through specialization (in fact, not only the network stack but also the application's implementation is specialized)

• A particular class of applications is chosen: those that serve the same content to many users
• Sandstorm: a web server serving static web pages
• Namestorm: a DNS server

Page 3:

Keys to performance

• A complete zero-copy stack
• Aggressive amortization
• Pre-packetized data
• Batching to mitigate system-call overhead

• Synchronous operation, clocked from received packets
• Improves cache locality
• Minimizes the latency of sending the first packet of a response

• Intel's DDIO

Page 4:

Network stack

• libnmio: data-movement and event-notification primitives
• libeth: a lightweight Ethernet layer
• libtcpip: an optimized TCP/IP layer
• libudpip: a UDP/IP layer

Page 5:

A complete zero-copy stack

• Receiving a packet
• Done by DMA

• Transmitting a packet
• Aggressive amortization
• Modify one of the prepared copies of the packet in place and hand it to DMA
• The modifications are performed in a single pass, to use the CPU's L1 cache efficiently

Page 6:

A complete zero-copy stack

• pre-copy method
• maintains more than one copy of each packet
• has the potential to thrash the CPU's L3 cache

• memcpy method
• maintains one long-term copy and creates ephemeral copies
• more work must be done per packet

Page 7:

How does the optimization work?

• Batching increases TCP RTT
• Amortization reduces per-request processing

Page 8:

Intel’s DDIO

• Direct Data I/O

• On transmission
• Data is pulled from the L3 cache without a detour through system memory

• On reception
• DMA can place data directly in the processor's L3 cache

Page 9:

Evaluation

Page 10:

Evaluation

Page 11:

Evaluation

Page 12:

DDIO

• Pre-copy case: DDIO pulls untouched incoming data into the cache, so the file data cannot stay cached
• memcpy case: the CPU loads the file data into the cache

Page 13:

Discussion

• mTCP vs. Sandstorm

Page 14:

Discussion

• mTCP
• Provides a UNIX-like socket programming interface
• Provides fairness

• Sandstorm's TCP
• A higher-level stack does not wrap the lower-level stack
• Each stack is a stand-alone service; for example, an application interacts directly with libnmio
• Amortization, no queueing, and inaccurate timers cannot guarantee correctness
• Limited to certain applications
