Hardware accelerating Linux network functions Roopa Prabhu, Wilson Kok Proceedings of netdev 0.1, Feb 14-17, 2015, Ottawa, On, Canada

Hardware accelerating Linux network functions

Roopa Prabhu, Wilson Kok

Proceedings of netdev 0.1, Feb 14-17, 2015, Ottawa, On, Canada


Agenda

● Recap: offload models, offload drivers
● Introduction to switch ASIC hardware
● L2 offload to switch ASIC
  ○ MAC learning, ageing
  ○ STP handling
  ○ IGMP snooping
  ○ VXLAN
● L3 offload to switch ASIC


Offload models ...

[Diagram: kernel bridges offloading to different hardware: NICs (NIC1, NIC2, each with ports such as port1, port2) and a switch ASIC (ports port1, port2 ... portn, with its own CPU, MEM and FDB). The kernel bridge FDB is kept in sync with the hardware FDB. Configuration enters through the rtnetlink API (bridge vlan add, bridge fdb add). Legend: rtnetlink API path, offload API path.]

● Single consistent netlink-based UAPI

● Single kernel offload API to offload to a variety of hardware (NICs, switch ASICs, ...)


The bigger picture...

[Diagram, three layers.
user: iproute2, quagga, bird, mstpd, bridge, brctl, tc, nftables, OVSdb, snmpd, lldpd.
kernel: routing tables, ARP tables, bridge FDB/MDB, netfilter tables, tc; bonds, bridges, VXLAN; switch ports swp1 ... swpN; hw driver.
HW: CPU, MEM; routing tables, ARP tables, bridge FDB/MDB, ACLs.]


HW offload driver (kernel)

[Diagram, three layers.
user: routing daemon, mstp, talking to the kernel over the rtnetlink API.
kernel: FIB; bridge br0 with FDB/MDB; switch ports swp1, swp2 ... swpN; hw driver implementing netdev_ops { .ndo_fdb_add/del, .ndo_fib_add/del }, reached through the switchdev offload API.
HW: CPU, ASIC, MEM; routing tables, ARP tables, bridge FDB/MDB, ACLs.]


HW offload driver (user space)

[Diagram, three layers.
user: routing daemon, mstp (rtnetlink API); rtnetlink listener and hw driver, fed by rtnetlink notifications from the kernel.
kernel: FIB; bridge br0 with FDB/MDB; switch ports swp1, swp2 ... swpN.
HW: CPU, ASIC, MEM; routing tables, ARP tables, bridge FDB/MDB, ACLs.]


[Diagram: the switch driver in the kernel exposes netdevs swp1, swp2, swp3 ... swpn, one for each front panel port 1, 2, 3 ... n of the switch hardware; the switch hardware also has a cpu port.]

switch driver:

● Creates netdevs for front panel ports

● Port netdevs only see traffic forwarded to the CPU port

● Sets hardware offload flag NETIF_F_HW_SWITCH_OFFLOAD on netdevs


ip link show switch ports

# ip link show
1: lo: <LOOPBACK> mtu 16436 qdisc noqueue state DOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether 00:e0:ec:27:4e:b6 brd ff:ff:ff:ff:ff:ff
3: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 500
    link/ether 44:38:39:00:27:ac brd ff:ff:ff:ff:ff:ff
4: swp2: <BROADCAST,MULTICAST> mtu 9000 qdisc pfifo_fast state DOWN mode DEFAULT qlen 500
    link/ether 00:e0:ec:27:4e:b8 brd ff:ff:ff:ff:ff:ff
[snip]
55: swp53: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 500
    link/ether 00:e0:ec:27:4e:f7 brd ff:ff:ff:ff:ff:ff
56: swp54s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 500
    link/ether 00:e0:ec:27:4e:fb brd ff:ff:ff:ff:ff:ff
57: swp54s1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 500
    link/ether 00:e0:ec:27:4e:fc brd ff:ff:ff:ff:ff:ff
58: swp54s2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 500
    link/ether 00:e0:ec:27:4e:fd brd ff:ff:ff:ff:ff:ff
59: swp54s3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 500
    link/ether 00:e0:ec:27:4e:fe brd ff:ff:ff:ff:ff:ff

[Slide annotations: eth0 is the management port; the swp* interfaces are the switch ports.]


ethtool on switch port

$ ethtool swp1
Settings for swp1:
    Supported ports: [ FIBRE ]
    Supported link modes:   1000baseT/Full
                            10000baseT/Full
    Supported pause frame use: Symmetric Receive-only
    Supports auto-negotiation: Yes
    Advertised link modes:  1000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: No
    Speed: 10000Mb/s
    Duplex: Full
    Port: FIBRE
    PHYAD: 0
    Transceiver: external
    Auto-negotiation: off
    Current message level: 0x00000000 (0)
    Link detected: yes


Creating a hardware accelerated Linux bridge device

# ip link add br0 type bridge

# ip link set dev swp1 master br0

# ip link set dev swp2 master br0

# bridge vlan add vid 10-20 dev swp1

# bridge vlan add vid 20-30 dev swp2
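
The two bridge vlan add commands above give swp1 VLANs 10-20 and swp2 VLANs 20-30. As a rough illustration only (plain Python, not part of the talk; the helper names expand_vids and shared_vlans are made up for this sketch, and it assumes VLAN filtering is enabled on the bridge), the snippet below expands the two ranges and lists the VLANs that both ports carry, which are the only VLANs on which frames can be bridged between them:

```python
def expand_vids(spec):
    """Expand a 'vid' argument such as '10-20' or '5' into a set of VLAN ids."""
    first, _, last = spec.partition("-")
    return set(range(int(first), int(last or first) + 1))


# Per-port VLAN membership exactly as configured on the slide.
port_vlans = {
    "swp1": expand_vids("10-20"),
    "swp2": expand_vids("20-30"),
}


def shared_vlans(port_a, port_b):
    """Return the sorted VLAN ids that both ports are members of."""
    return sorted(port_vlans[port_a] & port_vlans[port_b])


print(shared_vlans("swp1", "swp2"))  # prints [20]
```

With these ranges only VLAN 20 overlaps, so the hardware bridge would forward between swp1 and swp2 on that VLAN alone.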


Bonds as bridge ports

[Diagram: a kernel bridge with ports port1, port2 and bond0 (portn-1, portn), with its FDB in sync with hw; a switch ASIC (CPU, MEM, FDB) with ports port1, port2 ... portn-1, portn and a LAG for bond0 (portn-1, portn). Labelled paths: rtnetlink API (bridge vlan add, bridge fdb add), bonding driver, switchdev offload API.]

● Switch ASICs support link aggregation

● Bonding driver LAG config is offloaded to the switch ASIC

● FDB and VLAN offloads go through the bonding driver


Bridging hardware offload: packet path

[Diagram: a kernel bridge with swp1 and swp2 above a switch ASIC VLAN with ports swp1 and swp2. Three traffic paths are drawn: known unicast (transit), BUM*, and system generated/destined to system.]


Bridging hardware offload: packet path

● Known unicast traffic not destined to the system is forwarded only in hardware

● BUM (broadcast, unknown unicast, multicast) traffic is forwarded in hardware, and a copy MAY be sent to the kernel

● BUM traffic that reaches the kernel should not be forwarded again, otherwise receivers see duplicate copies (one from hardware, one from software)
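
The three rules above can be read as a small decision table. The following is a minimal Python model of that table, not kernel code: the Frame fields, hardware_path and kernel_should_forward are names invented for this sketch, and the copy_bum_to_cpu switch stands in for the "MAY" on the slide.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    dst_known: bool          # destination MAC is present in the hardware FDB
    dst_is_system: bool      # destination MAC belongs to the switch itself
    multicast: bool = False  # broadcast or multicast destination


def hardware_path(frame, copy_bum_to_cpu=True):
    """Return (forwarded_in_hw, sent_to_cpu) for a frame received on a switch port."""
    if frame.dst_is_system:
        # Traffic for the system is not bridged, it goes up through the CPU port.
        return (False, True)
    if frame.multicast or not frame.dst_known:
        # BUM: flooded by the ASIC, optionally copied to the kernel.
        return (True, copy_bum_to_cpu)
    # Known unicast transit traffic never leaves the ASIC.
    return (True, False)


def kernel_should_forward(forwarded_in_hw):
    """The software bridge must not forward again what hardware already forwarded."""
    return not forwarded_in_hw
```

The point of kernel_should_forward is the last bullet: a BUM copy that reaches the software bridge is there for local processing only, so flooding it again would duplicate the hardware's work.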


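Bridging hardware offload: fdb learn

[Diagram: user, kernel (bridge br0 with FDB/MDB, ports swp1, swp2 ... swpN, switch driver) and HW (CPU, ASIC, MEM). Labelled steps: hw events: learn/move (hardware to switch driver), fdb add/update (to br0), rtnetlink notification. Example: hardware reports 00:11:22:33:44:55, vlan 10, intf_id 9876; the bridge FDB entry is 00:11:22:33:44:55, br0, swp2.]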


Bridging hardware offload: learning in HW

● Turn off learning in the bridge driver
● Switch driver listens to learn notifications from hardware
● Converts the hardware interface id and vlan to the kernel ifindex of the bridge port (and vlan) and bridge
● Sends a netlink fdb update to the kernel (userspace driver) or calls the bridge driver learn sync switchdev API (kernel driver)
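
As a hedged sketch of the translation step described above, the Python below turns a hardware learn event into a bridge FDB update. Everything here is illustrative: the lookup tables, the ifindex values, the second port id 1234 and the function names are assumptions made for the example; only the MAC, vlan 10, intf_id 9876, br0 and swp2 come from the slides.

```python
# Hypothetical driver state: hardware interface id -> (netdev name, ifindex).
HW_PORT_TO_NETDEV = {
    9876: ("swp2", 4),
    1234: ("swp1", 3),
}
# Hypothetical bridge membership: port netdev -> bridge netdev.
NETDEV_MASTER = {"swp1": "br0", "swp2": "br0"}


def translate_learn_event(mac, vlan, intf_id):
    """Convert a hardware learn/move event into kernel terms."""
    name, ifindex = HW_PORT_TO_NETDEV[intf_id]
    return {
        "mac": mac,
        "vlan": vlan,
        "port": name,
        "ifindex": ifindex,
        "bridge": NETDEV_MASTER[name],
    }


def apply_learn(fdb, entry):
    """Install the entry in an FDB dict and report what happened."""
    key = (entry["bridge"], entry["mac"], entry["vlan"])
    previous_port = fdb.get(key)
    fdb[key] = entry["port"]
    if previous_port is None:
        return "add"
    return "refresh" if previous_port == entry["port"] else "move"


fdb = {}
event = translate_learn_event("00:11:22:33:44:55", 10, 9876)
print(apply_learn(fdb, event), fdb)
```

In a userspace driver the apply_learn step would be a netlink fdb update; in a kernel driver it would be the learn sync call into the bridge driver.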


Bridging hardware offload: kernel ageing

[Diagram: user, kernel (bridge br0 with FDB/MDB, ports swp1, swp2 ... swpN, switch driver) and HW (CPU, ASIC, MEM). Labelled arrows: get fdb hit status, fdb update, fdb delete (twice), rtnetlink.]


Bridging hardware offload: hardware ageing

[Diagram: user, kernel (bridge br0 with FDB/MDB, ports swp1, swp2 ... swpN, switch driver) and HW (CPU, ASIC, MEM). Labelled arrows: fdb delete (twice), rtnetlink.]


Bridging hardware offload: ageing

With hardware offload the bridge driver very seldom sees packets, so the FDB age it holds is not up to date.

Hardware ageing
● bridge driver should not do ageing if hardware is doing it
● fdb show will need to get the age from hardware during ‘show’, or need a periodic age update from the switch driver

Kernel ageing
● definitely need a periodic age update from the switch driver
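
To make the kernel ageing case concrete, here is a small Python model of one ageing pass driven by hardware hit bits. It is a sketch under stated assumptions, not the bridge driver's algorithm: age_fdb, the dict layout and the polling style are invented for the example, and the 300 second default is simply the usual Linux bridge ageing time used as a placeholder.

```python
def age_fdb(last_seen, hw_hit, now, ageing_time=300):
    """Run one ageing pass over a software FDB.

    last_seen: dict mapping (mac, vlan) -> time the entry was last refreshed
    hw_hit:    set of (mac, vlan) keys whose hardware hit bit was set since the last poll
    Returns the list of keys that expired and were removed.
    """
    # Periodic age update from the switch driver: refresh entries that hardware saw.
    for key in hw_hit:
        if key in last_seen:
            last_seen[key] = now
    # Expire everything that has not been refreshed for ageing_time seconds.
    expired = [key for key, seen in last_seen.items() if now - seen >= ageing_time]
    for key in expired:
        del last_seen[key]
    return expired


fdb = {("00:11:22:33:44:55", 10): 0, ("00:11:22:33:44:66", 10): 0}
print(age_fdb(fdb, {("00:11:22:33:44:55", 10)}, now=300))
```

Without the hit-bit refresh both entries would expire at the same time, even though one of them is still carrying traffic in hardware.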


STP offload

STP
● bridge driver maintains STP states (either kernel STP or userspace STP)
● bridge driver communicates STP states to the switch driver using the switchdev offload API
● OR a switch driver in userspace can listen to STP state notifications to update HW state
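
A switch driver that receives an STP state, by either route above, has to turn it into per-port hardware behaviour. The Python sketch below shows one plausible mapping. The numeric values follow the BR_STATE_* constants in the Linux bridge UAPI header; the (learn, forward) pair is a hypothetical hardware interface chosen for illustration and does not describe any particular ASIC.

```python
from enum import IntEnum


class BrState(IntEnum):
    """Bridge port STP states, numbered as in linux/if_bridge.h."""
    DISABLED = 0
    LISTENING = 1
    LEARNING = 2
    FORWARDING = 3
    BLOCKING = 4


def hw_port_flags(state):
    """Return (learn, forward) settings for a port in the given STP state."""
    state = BrState(state)
    learn = state in (BrState.LEARNING, BrState.FORWARDING)
    forward = state is BrState.FORWARDING
    return (learn, forward)


for s in BrState:
    print(s.name, hw_port_flags(s))
```

Only a forwarding port both learns and forwards; a learning port populates the FDB without passing traffic, and the remaining states do neither.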


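IGMP snooping offload

[Diagram: a kernel bridge with swp1 and swp2 above a switch ASIC with ports swp1 and swp2. Traffic shown: report (Join 224.1.2.3), query (Query), data (224.1.2.3). Resulting bridge state: mdb entry "dev bridge port swp1 grp 224.1.2.3 temp"; router ports on bridge: swp2.]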


IGMP snooping offload

● switch driver configures hardware to send IGMP reports and queries to software
● bridge driver maintains IGMP group membership
● in some cases the reports or queries need to be re-forwarded in the kernel
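
The membership state that the bridge driver keeps can be pictured with a toy model. The Python class below is only a sketch of generic IGMP snooping behaviour, with invented names (SnoopingBridge, on_report, on_query, egress_ports); it reuses the group 224.1.2.3 and the ports swp1 and swp2 from the slides, and swp3 is added purely as a third port for the example.

```python
class SnoopingBridge:
    """Minimal IGMP snooping state: group members and router ports."""

    def __init__(self):
        self.mdb = {}              # group address -> set of member ports
        self.router_ports = set()  # ports on which a query was seen

    def on_report(self, port, group):
        """A membership report (join) arrived on this port."""
        self.mdb.setdefault(group, set()).add(port)

    def on_query(self, port):
        """A query arrived on this port, so a multicast router sits behind it."""
        self.router_ports.add(port)

    def egress_ports(self, group, ingress):
        """Ports that should receive data for the group, excluding the ingress port."""
        members = self.mdb.get(group, set())
        return sorted((members | self.router_ports) - {ingress})


br = SnoopingBridge()
br.on_report("swp1", "224.1.2.3")   # Join 224.1.2.3 seen on swp1
br.on_query("swp2")                 # Query seen on swp2
print(br.egress_ports("224.1.2.3", ingress="swp2"))  # prints ['swp1']
```

This is the table a switch driver would mirror into the ASIC so that data for 224.1.2.3 is forwarded in hardware to swp1 rather than flooded.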


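VXLAN offload - hardware vtep

[Diagram: a bridge with ports swp1, swp2, swp3 and the VXLAN link vxlan100; local address lo: 172.16.20.103; hosts macA and macB are local, macC is reached through the remote VTEP 172.16.21.150. Other addresses shown: 20.0.0.2, 20.0.0.3, 20.0.0.5.]

MAC      Interface
macA     swp1
macB     swp2
macC     vxlan100

MAC      Destination
macC     172.16.21.150
unknown  172.16.22.125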


VXLAN offload - hardware vtep

Model
● VXLAN link as bridge port
  ○ bridging between local ports
  ○ VXLAN tunneling for remote MACs
● BUM traffic handling
  ○ multicast
  ○ using off-system replicator
    ■ could have a list of redundant replicators, need to choose ONE out of the list of remote dests (per flow or per vni etc.)
  ○ self replication
    ■ vtep sends to a list of remote vteps, need to choose ALL of the list of remote dests
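
The difference between "choose ONE" and "choose ALL" is easy to show in code. The Python below is an illustrative sketch only: the function names are invented, the per-flow choice uses a CRC32 hash as one possible stable selector (the talk does not say which is used), and apart from 172.16.22.125 and 172.16.21.150 the addresses are made-up examples.

```python
import zlib


def pick_replicator(replicators, flow_key):
    """Off-system replication: choose ONE replicator, stable for a given flow key."""
    index = zlib.crc32(flow_key.encode()) % len(replicators)
    return [replicators[index]]


def self_replicate(remote_vteps):
    """Self replication: the VTEP sends a copy to ALL remote VTEPs."""
    return list(remote_vteps)


replicators = ["172.16.22.125", "172.16.22.126"]    # second address is hypothetical
remote_vteps = ["172.16.21.150", "172.16.21.151"]   # second address is hypothetical

# The flow key could equally be the VNI if the choice is made per vni.
print(pick_replicator(replicators, "macA->ff:ff:ff:ff:ff:ff"))
print(self_replicate(remote_vteps))
```

Keying the choice on the flow (or the VNI) keeps one flow on one replicator while still spreading load across the redundant list.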


VXLAN offload - ovsdb integration

Agent to translate ovsdb schema objects to kernel constructs.

OVSDB                                                 Linux kernel
logical switch                                        vxlan link + bridge
physical switch tunnel_ip                             vxlan link local ip
logical port binding                                  bridge member port, vlan
unicast remote mac + physical locator                 bridge fdb (mac, vlan, dst <remote ip>)
mcast remote mac “unknown” + physical locator list    vxlan link default dest
unicast local mac + physical locator                  bridge fdb (mac, vlan, local dev)


l3 offloads

[Diagram, three layers.
user: Quagga/Bird, iproute, Network manager, installing routes over the rtnetlink API path, for example:

ip route add 1.1.1.1/32 nexthop via 192.168.200.3 nexthop via 192.168.200.4

kernel: FIB, neigh table, switch ports swp1, swp2 ... swpN, switch driver (arping for unresolved nexthop), connected to hardware by the offload API path.
HW: CPU, ASIC, MEM; routing tables, neigh tables.]


l3 hardware offload

● Routes via routing daemons go to the kernel
● Unresolved next hops point to the CPU in HW
● Switch driver tries to resolve them by probes (arping)
● Refresh neigh entries for pkts routed through hardware (hit bit provided by hardware)
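
The handling of unresolved next hops can be sketched as follows. This is a simplified Python model, not driver code: program_route, on_neigh_resolved, the "CPU" marker and the dict-based tables are assumptions for illustration, and the MAC address is a made-up example. The next hops come from the ip route add example in the talk.

```python
CPU = "CPU"  # marker for "trap to the CPU port" in this model


def program_route(nexthops, neigh_table):
    """Build the hardware next-hop entries for a route.

    Returns (hw_nexthops, to_probe): resolved next hops carry their MAC,
    unresolved ones point at the CPU and are queued for an arping probe.
    """
    hw_nexthops = {}
    to_probe = []
    for nh in nexthops:
        mac = neigh_table.get(nh)
        if mac is None:
            hw_nexthops[nh] = CPU
            to_probe.append(nh)
        else:
            hw_nexthops[nh] = mac
    return hw_nexthops, to_probe


def on_neigh_resolved(hw_nexthops, nexthop, mac):
    """Once a probe succeeds, repoint the hardware entry from the CPU to the MAC."""
    if hw_nexthops.get(nexthop) == CPU:
        hw_nexthops[nexthop] = mac


neigh = {"192.168.200.3": "00:00:5e:00:53:01"}  # hypothetical resolved neighbour
hw, probes = program_route(["192.168.200.3", "192.168.200.4"], neigh)
print(hw, probes)
```

Until 192.168.200.4 resolves, packets hashed to that next hop reach the kernel through the CPU port, which is what lets the kernel and the driver's probes complete the resolution.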
