IXP Lab 2012: Part 1

Post on 04-Jan-2016


Network Processor Brief

NCKU CSIE CIAL Lab 2

Outline

• Network Processor
• Intel IXP2400
  • Processing Element
  • Register
  • Memory Interface
• IXP Programming
  • Language
  • Programming Model
  • Programming Syntax


Router Development (1)

• Software based: general-purpose processor
  • Flexible
  • Poor performance
  • …
• Hardware based: ASIC
  • Best performance
  • Long development time


Router Development (2)

• Network processor (NPU) based
  • A balance of both
  • How?
    • Parallel processors
    • Multi-threaded cores
    • Programmable processors with non-programmable coprocessors


Network Processor Overview

• Built for high-speed packet processing
• Comprises
  • Multiple cores executing in parallel
  • Multi-threaded cores
  • Reduced instruction set
  • Multiple memory interfaces


Hierarchical Layer

• Data-Plane
  • Fast-Path
  • Slow-Path
• Control-Plane
  • Routing Protocol
• Management-Plane
  • Monitor Applications
  • User Interface


Data-Plane

• Fast-Path
  • General packet handling
  • As fast as possible
• Slow-Path
  • Exception packet handling
    • Packets with options
    • Local TCP/IP stack
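The fast-path/slow-path split above can be sketched as a small classifier. This is a host-side C illustration, not IXP code: the `enum path` labels, the function name `classify_ipv4`, and the assumption that the router has a single local address are all hypothetical. It sends IPv4 packets that carry options, or that are addressed to the local stack, to the slow path.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical path labels; the names are illustrative only. */
enum path { FAST_PATH, SLOW_PATH };

/* Classify an IPv4 packet from its raw header bytes.
 * local_addr is the router's own address in host byte order
 * (assumption: the router has exactly one local address). */
enum path classify_ipv4(const uint8_t *hdr, size_t len, uint32_t local_addr)
{
    unsigned version, ihl;
    uint32_t dst;

    if (len < 20)
        return SLOW_PATH;            /* truncated header: exception */

    version = hdr[0] >> 4;
    ihl = hdr[0] & 0x0Fu;            /* header length in 32-bit words */
    if (version != 4 || ihl < 5)
        return SLOW_PATH;            /* malformed header: exception */
    if (ihl > 5)
        return SLOW_PATH;            /* IP options are present */

    /* Destination address occupies bytes 16..19 in network byte order. */
    dst = ((uint32_t)hdr[16] << 24) | ((uint32_t)hdr[17] << 16) |
          ((uint32_t)hdr[18] << 8)  |  (uint32_t)hdr[19];
    if (dst == local_addr)
        return SLOW_PATH;            /* for the local TCP/IP stack */

    return FAST_PATH;                /* general packet handling */
}
```

In the IXP2400 settings described later in the deck, the fast-path branch would run on the Microengines and the slow-path branch on the XScale.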


Internet eXchange Processor

• First Generation: IXP1200, IXP1240, IXP1250
• Second Generation: IXP2400, IXP2800, IXP2850, IXP2805, IXP2855
• Others: IXP4XX


Network Flow Processor

• By Netronome
• Derived from Intel IXP2XXX
• NFP-3240, NFP-3216


Intel IXP2400 Block Diagram


IXP2400 Overview

• Functional Blocks
  • Processing Elements
  • Memory Interfaces
  • Coprocessors
  • Other Interfaces
• Hierarchical View


Processing Element

• Programmability
• Hierarchical Processing Elements
  • XScale
  • Microengine (ME)


XScale

• RISC-based processor (ARMv5TE)
• Real-time OS
  • MontaVista Linux
• ME management
  • Controls ME execution
  • Resource management


MicroEngine (1)

• Eight MEs per IXP2400, working in parallel
• Eight threads per ME
• The ME instruction set is reduced for packet processing only
  • Not as powerful as a general-purpose processor
  • No floating-point instructions
  • No divide instruction
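Because the ME has no divide instruction, a division has to be built from simpler operations. The following is a minimal host-side C sketch of shift-and-subtract (restoring) division, using only shifts, compares, and subtractions. The function name `soft_udiv32` and its zero-divisor behaviour are assumptions made for illustration; this is not taken from the IXA SDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned 32-bit division using only shift, compare and subtract.
 * Returns the quotient and, if rem is not NULL, stores the remainder.
 * Assumption: a zero divisor yields quotient 0 and remainder 0. */
uint32_t soft_udiv32(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint32_t q = 0;
    uint32_t r = 0;
    int i;

    if (d == 0) {
        if (rem != NULL)
            *rem = 0;
        return 0;
    }

    /* Bring down one dividend bit per step, most significant first. */
    for (i = 31; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1u);
        if (r >= d) {
            r -= d;              /* the divisor fits: subtract it */
            q |= (1u << i);      /* and set this quotient bit */
        }
    }

    if (rem != NULL)
        *rem = r;
    return q;
}
```

When the divisor is a known power of two, a single right shift is enough, which is one reason to size tables and buffers in powers of two on such hardware.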


MicroEngine (2)

• No OS
  • Not interactive
  • Managed by XScale
• Code Store (4K instructions)
• Executing


MicroEngine Threads

• Concurrent execution
• Non-preemptive
• Round-robin execution
• Each thread owns a private set of registers
• Zero-overhead context switching


Registers of ME

• 256 GPRs
• 256 SRAM Transfer Registers
  • 128 Read
  • 128 Write
• 256 DRAM Transfer Registers
  • 128 Read
  • 128 Write
• 128 Next Neighbor Registers


Context Switch

• Register contents need not be swapped out and back in during a context switch.
• With this mechanism, when a thread swaps out after issuing a memory request, another thread can swap in and do useful work to cover the long memory latency.
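The latency-hiding effect can be sketched with a tiny cycle-count simulation in host-side C. The model is a simplification and rests on stated assumptions: every thread repeatedly computes for `compute` cycles and then waits `latency` cycles for a memory reference, switching costs nothing, and the next ready thread is picked in round-robin order. The function name and parameters are hypothetical.

```c
#include <stdint.h>

#define MAX_THREADS 8u   /* eight threads per ME, as stated in the deck */

/* Returns how many of total_cycles the ME spends executing instructions.
 * Returns 0 for invalid arguments (assumption made for this sketch). */
uint32_t simulate_busy_cycles(unsigned threads, uint32_t compute,
                              uint32_t latency, uint32_t total_cycles)
{
    uint32_t ready_at[MAX_THREADS] = {0};  /* cycle at which each thread may run */
    uint32_t now = 0;
    uint32_t busy = 0;
    unsigned next = 0;                     /* round-robin starting point */

    if (threads == 0 || threads > MAX_THREADS || compute == 0)
        return 0;

    while (now < total_cycles) {
        unsigned chosen = threads;         /* sentinel: no thread is ready */
        unsigned i;
        uint32_t run;

        for (i = 0; i < threads; i++) {
            unsigned cand = (next + i) % threads;
            if (ready_at[cand] <= now) {
                chosen = cand;
                break;
            }
        }
        if (chosen == threads) {           /* all threads wait on memory */
            now++;                         /* the ME idles for one cycle */
            continue;
        }

        /* Non-preemptive: the thread runs until it issues its memory request. */
        run = compute;
        if (run > total_cycles - now)
            run = total_cycles - now;
        busy += run;
        now += run;
        ready_at[chosen] = now + latency;  /* swapped out until data returns */
        next = (chosen + 1) % threads;
    }
    return busy;
}
```

Under this model, with 10 compute cycles per memory reference and a 90-cycle latency, one thread keeps the ME busy for 100 of 1000 cycles, while eight threads keep it busy for 800 of 1000.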


Memory Interface of IXP2400

• Local Memory
  • Smallest and fastest
• Scratchpad
  • Passes packet handles
• SRAM
  • Holds data structures for packet processing
• DRAM
  • Largest and slowest
  • Holds packet contents


Local Memory

• Per ME
  • Not accessible to other MEs
  • Not accessible to the XScale
• Size: 2560 bytes (640 longwords)
• Usage
  • Variable spilling
  • Caching
• Latency: 3 cycles


Scratchpad

• On-chip memory
• Shared by all MEs
• Size: 16 KB (fixed)
• Usage:
  • Scratchpad
  • Scratch Ring (hardware FIFO)
• Latency: ~60 cycles


SRAM

• Off-chip memory
• Shared by all MEs (2 channels)
• Size: 64 MB per channel at maximum
• Usage:
  • Hardware FIFO
  • Holds data structures
  • Holds packet meta-data
• Latency: ~90 cycles


DRAM

• Off-chip memory
• Shared by all MEs (1 channel)
• Size: 1 GB at maximum
• Usage:
  • Holds whole packet contents
  • Alternative space for data structures
• Latency: ~120 cycles
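The four memory levels can be collected into one table for quick comparison. The C sketch below uses only the sizes and approximate latencies quoted on these slides. The struct, the table name, and the helper `fastest_fit` are hypothetical and exist only to illustrate the trade-off; real placement also depends on access pattern and bandwidth.

```c
#include <stddef.h>
#include <stdint.h>

/* One level of the IXP2400 memory hierarchy, as described in the deck. */
struct mem_tier {
    const char *name;
    uint64_t capacity_bytes;   /* maximum size quoted on the slides */
    unsigned latency_cycles;   /* approximate access latency */
    int shared;                /* non-zero if visible to all MEs */
};

/* Ordered from fastest to slowest. */
static const struct mem_tier ixp2400_tiers[] = {
    { "Local Memory", 2560u,                    3,   0 },
    { "Scratchpad",   16u * 1024u,              60,  1 },
    { "SRAM",         64ull * 1024 * 1024,      90,  1 },  /* per channel */
    { "DRAM",         1024ull * 1024 * 1024,    120, 1 },
};

/* Pick the lowest-latency level that can hold `bytes`.
 * If must_be_shared is non-zero, per-ME Local Memory is skipped.
 * Returns NULL when nothing is large enough. */
const struct mem_tier *fastest_fit(uint64_t bytes, int must_be_shared)
{
    size_t i;
    size_t n = sizeof ixp2400_tiers / sizeof ixp2400_tiers[0];

    for (i = 0; i < n; i++) {
        const struct mem_tier *t = &ixp2400_tiers[i];
        if (must_be_shared && !t->shared)
            continue;
        if (bytes <= t->capacity_bytes)
            return t;
    }
    return NULL;
}
```

For example, a 1 KB table used by a single ME fits in Local Memory (3 cycles), but the same table shared between MEs has to move to the Scratchpad (~60 cycles).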


Coprocessor

• MSF (Media Switch Fabric)
  • Receives packets into DRAM
  • Transmits packets from DRAM
• SHaC
  • Scratchpad
  • Hash Unit
  • CAP


Packet META-DATA (1)

• Data for processing packets
• How to identify a packet?
  • Packet handle
• Packet temporal information
  • Not related to packet content
• Meta-data
  • Input port, output port
  • Information on the packet's address in DRAM
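A packet handle and its meta-data might be modelled as follows. This is a host-side C sketch under loud assumptions: the handle is treated as a plain buffer index, buffers are a fixed 2048 bytes, and the field layout is invented for illustration. None of this is the actual IXA SDK buffer-handle format.

```c
#include <stdint.h>

#define PKT_BUF_SIZE 2048u    /* assumed fixed DRAM buffer size per packet */
#define PORT_UNKNOWN 0xFFFFu  /* assumed marker: output port not yet decided */

typedef uint32_t pkt_handle_t;  /* assumed: handle is simply a buffer index */

/* Meta-data that travels with the handle instead of the packet itself. */
struct pkt_meta {
    pkt_handle_t handle;   /* identifies the packet */
    uint16_t in_port;      /* port the packet arrived on */
    uint16_t out_port;     /* filled in by the processing stage */
    uint32_t dram_addr;    /* where the packet content lives in DRAM */
};

/* Map a handle to the DRAM address of its buffer. */
uint32_t handle_to_dram_addr(pkt_handle_t h, uint32_t dram_base)
{
    return dram_base + h * PKT_BUF_SIZE;
}

/* Build the meta-data for a packet that has just been received. */
struct pkt_meta make_meta(pkt_handle_t h, uint16_t in_port, uint32_t dram_base)
{
    struct pkt_meta m;

    m.handle = h;
    m.in_port = in_port;
    m.out_port = PORT_UNKNOWN;
    m.dram_addr = handle_to_dram_addr(h, dram_base);
    return m;
}
```

The point of the split is that stages exchange only the small handle and meta-data, while the packet content stays where the receive block wrote it in DRAM.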


Packet META-DATA (2)

• How is this information passed between MEs?
• Hardware FIFO
  • Scratch Ring
  • SRAM Ring
  • Next-Neighbor Ring
• Issues
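The rings listed above behave like bounded FIFOs of packet handles. A software model in host-side C might look like the sketch below. The ring size, struct, and function names are assumptions, and unlike the hardware rings this model provides no atomicity between a producer and a consumer running concurrently.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 128u   /* assumed capacity; must be a power of two */

/* Bounded FIFO of 32-bit packet handles. head and tail count every
 * get and put; their difference is the number of queued entries. */
struct ring {
    uint32_t slot[RING_SIZE];
    uint32_t head;   /* total entries consumed */
    uint32_t tail;   /* total entries produced */
};

/* Producer side: returns false when the ring is full. */
bool ring_put(struct ring *r, uint32_t handle)
{
    if (r->tail - r->head == RING_SIZE)
        return false;
    r->slot[r->tail % RING_SIZE] = handle;
    r->tail++;
    return true;
}

/* Consumer side: returns false when the ring is empty. */
bool ring_get(struct ring *r, uint32_t *handle)
{
    if (r->tail == r->head)
        return false;
    *handle = r->slot[r->head % RING_SIZE];
    r->head++;
    return true;
}
```

A full ring is one of the practical issues: the producing stage has to retry or drop the packet when `ring_put` fails, so ring depth affects how well stages tolerate bursts.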


Hierarchical View (Setting #1)

• Only one IXP2400-based board
• Data-Plane
  • Fast-Path: Microengine
  • Slow-Path: XScale
• Control-Plane: XScale
• Management-Plane: XScale


Hierarchical View (Setting #2)

• Multiple IXP2400-based boards
• Data-Plane
  • Fast-Path: Microengine
  • Slow-Path: XScale
• Control-Plane: CPU
• Management-Plane: CPU


Programming IXP2400

• XScale
  • Programmed in C
• Microengine
  • Programmed in MicroC or Microcode
  • We will focus on this part!


IDE Tool--IXA SDK Workbench


ME Language

• MicroC
  • Subset of ANSI C
  • Only a limited part of the standard C libraries is implemented
  • Intrinsic library that supports IXP operations
• Microcode
  • High-level assembly


Programming Model (1)

• Receive → Processing → Transmit
• Intel provides sample code for receive and transmit.
• We focus only on the processing part.

RX PROCESSING TX


Programming Model (2)

• Processing ME
  • Pipeline Model
  • Parallel Model
  • Mixed Model

RX PROCESSING TX


Pipeline Model

RX → PROC #1 → PROC #2 → TX

• Each stage controls all the resources of its ME
• Hard to balance the different stages


Parallel Model

[Diagram: RX feeds PROC #1 and PROC #2 in parallel; both feed TX]

• Balancing is easy
• Higher performance
• Resources are limited
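The balance and performance trade-off between the pipeline and parallel models can be made concrete with a simple steady-state estimate. This host-side C sketch is a back-of-the-envelope model and rests on assumptions: `stage_cost` gives the cycles each processing stage needs per packet, hand-off between stages is free, and in the parallel model every ME runs all stages on its own packets. The function names are hypothetical.

```c
#include <stddef.h>

/* Pipeline model: one ME per stage. In steady state a packet leaves
 * the pipeline every max(stage_cost) cycles, so the slowest stage
 * sets the rate. */
unsigned pipeline_cycles_per_packet(const unsigned *stage_cost, size_t stages)
{
    unsigned worst = 0;
    size_t i;

    for (i = 0; i < stages; i++)
        if (stage_cost[i] > worst)
            worst = stage_cost[i];
    return worst;
}

/* Parallel model: each of `mes` MEs runs every stage for its own
 * packets, so the group finishes `mes` packets per sum(stage_cost)
 * cycles. Returns 0.0 when mes is 0 (assumption for this sketch). */
double parallel_cycles_per_packet(const unsigned *stage_cost, size_t stages,
                                  unsigned mes)
{
    unsigned total = 0;
    size_t i;

    if (mes == 0)
        return 0.0;
    for (i = 0; i < stages; i++)
        total += stage_cost[i];
    return (double)total / (double)mes;
}
```

With two stages costing 100 and 300 cycles on two MEs, the pipeline delivers one packet every 300 cycles while the parallel arrangement averages one every 200. With balanced stages of 200 cycles each, both arrangements deliver one packet every 200 cycles. The estimate ignores that in the parallel model each ME must hold the code and state for every stage.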


Mixed Model

[Diagram: RX, PROC #1, PROC #2, PROC #3, TX]


MicroC Example 1 (1)

    void main()
    {
        _declspec(shared sram) int old_array[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        _declspec(shared sram) int new_array[sizeof(old_array)/sizeof(int)];

        global_label("start_reverse");
        reverse_array(old_array, new_array, sizeof(old_array)/sizeof(int));
        global_label("end_reverse");
    }


MicroC Example 1 (2)

    void reverse_array(volatile int* old, volatile int* new, int size)
    {
        int index = 0;

        for (index = 0; index < size; index++) {
            new[index] = old[size - index - 1];
        }
    }


MicroC Example 2

    sram_read(&sram_egt_dim1_2_node,
              (__declspec(sram) unsigned int *)(PACKET_CLASSIFICATION_SRAM_BASE1 + current*8),
              2, sig_done, &sram_read_sig_dim1_2);

    __wait_for_all(&sram_read_sig_dim1_2);
    temp = sram_egt_dim1_2_node.next_dim;


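1. Copy IXA_SDK_3.51 and ixp_book to D:\, then reboot.

3. Press [Ctrl+Enter] to enter the recovery card's administrator mode.

4. Password: davidchang

5. Unzip ixasdk351cd1windows.zip, ixasdk351cd3.zip, and ixasdk351framework.zip, then install them in that order (a reboot is required after installing cd1).

6. Copy the ixp_book directory to C:\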