Productize programmable network infrastructure
Yi Tseng MTS, Open Networking Foundation
Outline
• An overview of Aether project
• Aether edge P4-based disaggregated UPF
• Productize programmable network infrastructure
Yi Tseng
Member of Technical Staff, Open Networking Foundation
• 2017: Intern (ONOS, fabric.p4, M-CORD)
• 2018-now: MTS, PDP Team (Stratum, Fabric.p4)
An overview of Aether project
ONF has history of successfully driving disaggregation and SDN
• Packet Switch: ONOS & OpenFlow & P4 (apps on top of SDN switches, using OpenFlow / P4Runtime and a P4 program)
• Broadband / PON: SEBA
• Optical Transport: ODTN
Disaggregation and virtualization for mobile networks
• Starting point: Mobile Core and Base Station (RAN)
• Mobile core control-user plane separation (CUPS): Mobile Core CP and Mobile Core UP
• RAN disaggregation: RU, DU, CU-CP, CU-UP
• Virtualization: components run as VNFs
Software-defined networking for mobile networks
• VNFs: Mobile Core CP & UP, DU, CU-CP, CU-UP
• ONOS: Trellis fabric control, RAN control, UPF
• Stratum switches, each controlled via P4Runtime
• SDN & VNF infrastructure
• UPF.p4: P4-based mobile RAN and core user-planes
Distributed cloud for mobile networks
• Connected Edge: small cells (CBRS or licensed band), Open RAN Controller, Mobile Core User Plane (P4 UPF), edge apps, IoT, AI/ML platform(s)
• Central Cloud: Aether Connectivity Platform, Aether Management Platform, Enterprise Control Portal, central IoT and AI/ML apps
• Distributed Mobile Core User Plane: provides local breakout at all remote Aether Edge sites
• Shared Mobile Core Control Plane in central cloud: supports all Aether Edge sites
Aether has been operational since December’19
Aether Edge P4-based Disaggregated UPF
A disaggregated UPF
• SMF/SPGW-C connects to the UPF / SPGW-U over PFCP
• Inside the UPF / SPGW-U, Control programs the Fast-path over gRPC/P4Runtime
Combine Fast-paths
• SMF/SPGW-C connects over PFCP to the UPF / SPGW-U Control, which drives both fast-paths over gRPC/P4Runtime
• HW Fast-path (Tofino+FPGA)
◦ Higher tput (1-10s Tbit/s)
◦ Lower latency (100s ns)
◦ Smaller memory (10s MB)
• SW Fast-path (BESS)
◦ Lower tput (10-100s Gbit/s)
◦ Higher latency (100s µs)
◦ Larger memory (100s GB)
Benefits in leveraging both fast-paths in the same UPF!
Example: Tesla factory
• Requirement: 1M UEs
◦ 5% smartphones
◦ 10% wideband IoT devices (e.g., HD cameras)
◦ 85% narrowband IoT devices (e.g., low data sensors)
• Solution
◦ HW fast-path: smartphone + wideband IoT, 150K sessions
◦ SW fast-path: narrowband IoT, 850K sessions
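As a quick check of the arithmetic in this example, the sketch below splits the 1M UEs across the two fast-paths using the percentages from the slide. It assumes one session per UE, and every name in it (constants, dictionaries, function) is made up for illustration rather than taken from any Aether component.

```python
# Illustrative sizing check; all names here are hypothetical.
TOTAL_UES = 1_000_000

# Device mix from the example, as integer percentages of all UEs.
DEVICE_MIX_PCT = {
    "smartphone": 5,
    "wideband_iot": 10,
    "narrowband_iot": 85,
}

# Assumed mapping of device class to fast-path, following the slide.
FAST_PATH = {
    "smartphone": "hw",
    "wideband_iot": "hw",
    "narrowband_iot": "sw",
}


def sessions_per_fast_path(total_ues, mix_pct, placement):
    """Return {fast_path: sessions}, assuming one session per UE."""
    sessions = {}
    for device, pct in mix_pct.items():
        path = placement[device]
        sessions[path] = sessions.get(path, 0) + total_ues * pct // 100
    return sessions


split = sessions_per_fast_path(TOTAL_UES, DEVICE_MIX_PCT, FAST_PATH)
print(split)
```

The result matches the slide: 150K sessions on the HW fast-path and 850K on the SW fast-path.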
Architecture
• SMF/SPGW-C
• UPF / SPGW-U
◦ PFCP Agent (control): aware of the fast-paths
◦ ONOS with UP4 App (Tofino, FPGA, D-BUF) and Trellis App (RIB, mcast, etc.)
◦ UP4 App: logical P4 pipeline, physically realized with Tofino + FPGA + D-BUF
• HW path
◦ Tofino Switch (fabric.p4), running Stratum
◦ FPGA NIC (hqos.p4), running Stratum: optional, for advanced hierarchical QoS; can rely on Tofino for simpler QoS
◦ D-BUF: holds downlink packets in memory during UE power save mode; can run on top of the FPGA NIC
• SW path
◦ BESS: full UPF pipeline for low data rate sessions
◦ Detour based on IP pools/prefixes
• Control interfaces: P4Runtime, gNMI, gRPC
• Connected to: Base Station, DN, external routing (BGP, OSPF, etc.)
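The detour between the two fast-paths can be pictured as a prefix match on the UE address. This is a minimal sketch, assuming each UE IP pool is assigned to exactly one fast-path. The pool prefixes and the function name are hypothetical, and this is not a description of how UP4 or fabric.p4 implement the detour.

```python
import ipaddress

# Hypothetical UE address pools; a real deployment would take these
# from the mobile core configuration.
SW_PATH_POOLS = [ipaddress.ip_network("10.250.0.0/16")]  # e.g. narrowband IoT
HW_PATH_POOLS = [ipaddress.ip_network("10.240.0.0/16")]  # e.g. smartphones


def select_fast_path(ue_ip):
    """Return "sw" or "hw" for a UE address, or None if no pool matches."""
    addr = ipaddress.ip_address(ue_ip)
    if any(addr in pool for pool in SW_PATH_POOLS):
        return "sw"
    if any(addr in pool for pool in HW_PATH_POOLS):
        return "hw"
    return None


print(select_fast_path("10.250.3.7"))
print(select_fast_path("10.240.0.15"))
```

Keying the decision on the address pool means the choice can be made per packet without per-session state, which is one plausible reason for using pools and prefixes.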
Status (as of December 2020)
• Already started rolling out at several Aether Edge sites
◦ GTP termination and accounting on Tofino, integrated with Trellis/ONOS fabric control
• Q1 2021
◦ Downlink buffering via DBUF
◦ QoS on Tofino with shared queues
◦ Improved scale: 10k UEs
• Q4 2021
◦ Integration with FPGA NIC for advanced QoS
◦ Scale improvements
• Long-term
◦ Integration with SW-based fast-path (BESS)
Productize programmable network infrastructure
Software stack
• Components: Trellis, UP4, ONOS, Fabric.p4, Stratum, Open Network Linux, Barefoot SDE
• Layers: control plane software and data plane software
Improved, optimized software stack
• Components: Trellis, UP4, ONOS, Fabric.p4 (now Fabric-TNA), Stratum (now Stratum-bfrt), Open Network Linux, Barefoot SDE
• Layers: control plane software and data plane software
• Fabric-TNA: fabric.p4 rewritten from the v1model architecture to the Tofino Native Architecture (TNA), which allows us to create more advanced and optimized P4 programs
• Decoupled from the ONOS code-base, with a new release cycle
• New ONOS LTS with many stability, performance, and availability improvements
• Stratum-bfrt: new Stratum implementation based on the Barefoot native API, which unlocks more advanced ASIC management
• Several improvements to support fast deployment and troubleshooting
Fabric-TNA
• ONF’s fabric.p4 on Tofino Native Architecture (TNA)
• Supports Aether Edge use-cases
◦ Trellis (bridging, routing, …)
◦ UPF/SPGW-U: simple QoS, accounting, integration with D-BUF
◦ In-band Network Telemetry (INT): advanced telemetry report mechanism
Stratum-bfrt
• Stratum implementation with Barefoot BfRt C++ API
• Performance improvement
• Advanced ASIC control
◦ Batching/Transaction
◦ Register
◦ Traffic manager
◦ Egress mirroring
◦ Folded/Multi pipeline
◦ …
Software packages
• Packaged components: Trellis, ONOS, Fabric TNA, UP4, Stratum, Open Network Linux, Barefoot SDE
• Deliverables
◦ TOST container image (TOST: Trellis ONOS Stratum Tofino)
◦ Stratum container image
◦ ONIE installer
Kubernetes Integration
• Management Node: DHCP, HTTP, TFTP, Terraform, Rancher
• Tofino Switch: Open Network Linux, Docker, Kubernetes, Stratum, Prometheus Exporter(s)
• Tofino switch as a Kubernetes worker node
◦ With special taint and label to make sure only Stratum is deployed on it
• Stratum is deployed as Kubernetes service
◦ Deployed as DaemonSet. There will be one and only one instance on each switch node
◦ P4RT/gNMI exposed via NodePort
◦ externalTrafficPolicy=Local so the traffic won’t get load-balanced to other switches
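To make the Kubernetes bullets concrete, here is a rough sketch of a DaemonSet and a NodePort Service built as plain Python dictionaries and printed as JSON. The field names follow the standard Kubernetes API, but the label, taint key, image name, and port numbers are assumptions chosen for illustration, and a real Stratum deployment would need more settings than shown.

```python
import json

# All names below (labels, taint key, image, ports) are assumptions
# chosen for illustration.
SWITCH_LABEL = {"node-role.example.org/switch": "true"}
SWITCH_TAINT_KEY = "node-role.example.org/switch"
APP_LABELS = {"app": "stratum"}
GRPC_PORT = 9339  # assumed port serving both P4Runtime and gNMI

daemonset = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {"name": "stratum"},
    "spec": {
        "selector": {"matchLabels": APP_LABELS},
        "template": {
            "metadata": {"labels": APP_LABELS},
            "spec": {
                # Schedule only onto nodes labelled as switches.
                "nodeSelector": SWITCH_LABEL,
                # Tolerate the taint that keeps other workloads away.
                "tolerations": [{
                    "key": SWITCH_TAINT_KEY,
                    "operator": "Exists",
                    "effect": "NoSchedule",
                }],
                "containers": [{
                    "name": "stratum",
                    "image": "example.org/stratum:placeholder",
                    "ports": [{"containerPort": GRPC_PORT}],
                }],
            },
        },
    },
}

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "stratum"},
    "spec": {
        "type": "NodePort",
        # Keep traffic on the node that received it.
        "externalTrafficPolicy": "Local",
        "selector": APP_LABELS,
        "ports": [{
            "name": "grpc",
            "port": GRPC_PORT,
            "targetPort": GRPC_PORT,
            "nodePort": 30339,
        }],
    },
}

manifests = {"apiVersion": "v1", "kind": "List", "items": [daemonset, service]}
print(json.dumps(manifests, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`. The DaemonSet gives one pod per matching switch node, and `externalTrafficPolicy: Local` means a connection to a given switch's NodePort reaches the Stratum instance on that switch only.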
Automation - Build and Release
• Git-triggered automated build and release process for Trellis apps and control plane container image
• Build and release Stratum image weekly
• TOST repo ({ONOS, Trellis, UP4, Fabric-TNA, …})
◦ Submit review, pre-merge checks, merge
◦ Check version after merge
◦ Snapshot: build the master image and push it to the container registry
◦ Release (version x.y.z): add new Git tag, build the release image and push it to the container registry
• Stratum repo
◦ Master image built weekly and pushed to the container registry
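The "check version" step can be sketched as a small decision function. The slide does not say how versions are encoded, so this example assumes a Maven-style convention where `x.y.z-SNAPSHOT` marks a development snapshot and a bare `x.y.z` marks a release. The function name and the action strings are hypothetical.

```python
import re

# Assumed convention: "x.y.z" is a release, "x.y.z-SNAPSHOT" is a
# development snapshot. The slide only says "Check version".
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(-SNAPSHOT)?$")


def plan_build(version):
    """Return the (hypothetical) actions a post-merge job would take."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    if match.group(4):
        # Snapshot: only refresh the image that tracks master.
        return ["build master image"]
    # Release: tag the commit, then build an image for that version.
    return [f"add git tag {version}", f"build release image {version}"]


print(plan_build("1.0.1-SNAPSHOT"))
print(plan_build("1.0.1"))
```

Rejecting unrecognized version strings, rather than guessing, keeps a malformed version from silently producing a release tag.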
Automation - Deploy
• Human-triggered Jenkins pipeline based on Terraform
• Explicitly-defined helm chart version
• Get rid of issues seen in Rancher CLI
• Aether Pod config ({Helm chart, container images, …})
◦ Submit review, pre-merge checks, merge
◦ Trigger, then deploy to Development, Staging, and Production
Recap
• Aether - 5G/LTE Enterprise Private Edge Cloud
• P4-based disaggregated UPF
• Highly automated network infrastructure
Learn More
• Aether
◦ 5G/LTE Enterprise Private Edge Cloud
◦ https://aetherproject.org
• Trellis
◦ Leaf-spine SDN fabric for edge
◦ https://opennetworking.org/trellis
• Stratum
◦ Silicon-independent switch operating system for SDN
◦ https://stratumproject.org
• Slack Channel: onf-community
Thank you