8/12/2019 L13 IXAXscaleMicroengine SDK tutorial
ECE 526 Network Processing Systems Design
IXP XScale and Microengines
Chapters 18 & 19: D. E. Comer
Ning Weng ECE 526 2
Overview Recalled
Packet processing functions (forwarding, queuing)
Traditional network processing systems (CPU + NICs)
General network processor architecture and tradeoffs
Intel IXP network processors overall architecture
Focus on individual components of the Intel IXP chip:
Control processor (slow path): XScale core
Overall architecture
Typical functions
Processor features
Packet processing processor (fast path): Microengines
Architecture and features
Differences from conventional processors
Pipelining and multi-threading
Purpose of Control Processor
Functions typically executed by the embedded control processor:
Bootstrapping
Exception handling
Higher-layer protocol processing
Interactive debugging
Diagnostics and logging
Memory allocation
Application programs (if needed)
User interface and/or interface to the GPP
Control of packet processors
Other administrative functions
XScale Memory Architecture
Memory architecture:
Uses a 32-bit linear address space
Configurable endian mode
Byte addressable
Memory mapping:
Allocation of the address space (2^32) to different system components
An access to memory is translated into an access to the corresponding component
Needs to be carefully crafted
XScale assumes byte-addressable memory
Underlying memory uses a different access size (SDRAM)
How does this work?
Support for virtual memory:
For demand paging to secondary storage
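As a rough illustration of the memory-mapping idea on this slide, the sketch below decodes a 32-bit byte address into a component and an offset, then splits a byte offset into a unit index and a byte position for a memory organized in wider units. The address ranges, component names, and the 8-byte unit size are assumptions chosen for illustration and are not the actual IXP2400 memory map.

```python
# Hypothetical memory map: (base address, size in bytes, component name).
# These ranges are illustrative only, not the real IXP2400 layout.
MEMORY_MAP = [
    (0x0000_0000, 0x8000_0000, "SDRAM"),
    (0x8000_0000, 0x4000_0000, "SRAM"),
    (0xC000_0000, 0x4000_0000, "I/O and other devices"),
]

SDRAM_UNIT_BYTES = 8  # assumed width of one physical SDRAM access unit


def decode(address):
    """Return (component, offset) for a 32-bit byte address."""
    if not 0 <= address < 2**32:
        raise ValueError("address outside the 32-bit linear address space")
    for base, size, name in MEMORY_MAP:
        if base <= address < base + size:
            return name, address - base
    raise ValueError("unmapped address")


def byte_to_unit(offset, unit_bytes=SDRAM_UNIT_BYTES):
    """Split a byte offset into (unit index, byte position inside the unit)."""
    return divmod(offset, unit_bytes)
```

For example, `decode(0x8000_0010)` selects the hypothetical SRAM region with offset 0x10, and `byte_to_unit(0x13)` shows that byte 19 lives in unit 2 at byte position 3, so the hardware would fetch the whole unit and extract the requested byte.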
Shared Memory Address Issues
Memory is shared between the XScale and the Microengines:
Same data, but different addresses
What impact does this have?
Pointers need to be translated
Data structures that contain pointers cannot be shared without translation
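A minimal sketch of what pointer translation could look like, assuming the same SRAM appears at one base address in the XScale address space and at another base for the uEs. Both base values and the helper names are hypothetical. The real offsets depend on the chip and its configuration.

```python
# Hypothetical base addresses at which each processor sees the same SRAM.
XSCALE_SRAM_BASE = 0x8000_0000  # assumed: byte address in the XScale map
UE_SRAM_BASE = 0x0000_0000      # assumed: uE addresses SRAM starting at zero


def xscale_to_ue(xscale_addr):
    """Translate an XScale pointer into the address a uE would use."""
    return xscale_addr - XSCALE_SRAM_BASE + UE_SRAM_BASE


def ue_to_xscale(ue_addr):
    """Translate a uE SRAM address back into an XScale pointer."""
    return ue_addr - UE_SRAM_BASE + XSCALE_SRAM_BASE


# A linked-list node stored in shared memory as (value, next pointer).
# The next pointer was written by the XScale, so it is an XScale address.
node_written_by_xscale = {"value": 42, "next": 0x8000_0100}

# A uE that follows the list must translate the pointer before using it.
next_for_ue = xscale_to_ue(node_written_by_xscale["next"])
```

Every pointer stored inside a shared structure needs this treatment on each traversal, which is why pointer-heavy data structures are awkward to share between the two processors.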
Microengines
Microengines are data-path packet processors
The IXP2400 has 8 Microengines
Simpler than XScale
Low-level device that acts as a micro-sequencer
Optimized for packet processing
More complex to use
Often abbreviated as uE
uE Functions
uEs handle ingress and egress packet processing:
Packet ingress from physical layer hardware
Checksum verification
Header processing and classification
Packet buffering in memory
Table lookup and forwarding
Header modification
Checksum computation
Packet egress to physical layer hardware
uE Architecture
uE characteristics:
Programmable microcontroller
RISC design
256 general-purpose registers
512 transfer registers
128 next neighbor registers
Hardware support for 8 threads and context switching
640 words of local memory
Control of an Arithmetic and Logic Unit
Direct access to various functional units
A unit to compute a Cyclic Redundancy Check (CRC)
uE as Micro-sequencer
A micro-sequencer does not contain native instructions for each possible operation
Instead of using instructions, the uE invokes functional units to perform operations
Control unit is much simpler
Example 1: uE does not have ADD R2,R3 instruction
Instead: ALU ADD R2, R3
ALU indicates that ALU should be used
ADD is a parameter to ALU
Example 2: Memory access is not done by a simple LOAD R2, 0xdeadbeef
Instead: SRAM LOAD R2, 0xdeadbeef
Altogether similar to a normal processor, but more basic
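The two examples on this slide can be mimicked with a small simulation in which each operation names a functional unit and passes the operation as a parameter. This is only a sketch of the idea: the class name, the register set, and the assumption that `ALU ADD R2, R3` means R2 = R2 + R3 are illustrative and are not taken from the uE instruction set.

```python
class MicroSequencer:
    """Toy model: the sequencer only dispatches work to functional units."""

    def __init__(self):
        self.regs = {f"R{i}": 0 for i in range(8)}  # small illustrative set
        self.sram = {}                              # address -> 32-bit word
        # The control unit stays simple: it just selects a functional unit.
        self.units = {"ALU": self._alu, "SRAM": self._sram}

    def _alu(self, op, dst, src):
        # Assumed semantics: dst = dst + src, truncated to 32 bits.
        if op != "ADD":
            raise ValueError(f"unsupported ALU operation: {op}")
        self.regs[dst] = (self.regs[dst] + self.regs[src]) & 0xFFFFFFFF

    def _sram(self, op, reg, address):
        if op == "LOAD":
            self.regs[reg] = self.sram.get(address, 0)
        elif op == "STORE":
            self.sram[address] = self.regs[reg]
        else:
            raise ValueError(f"unsupported SRAM operation: {op}")

    def execute(self, unit, op, *operands):
        """Invoke the named functional unit with the operation as a parameter."""
        self.units[unit](op, *operands)


seq = MicroSequencer()
seq.sram[0xDEADBEEF] = 5
seq.regs["R3"] = 7
seq.execute("SRAM", "LOAD", "R2", 0xDEADBEEF)  # SRAM LOAD R2, 0xdeadbeef
seq.execute("ALU", "ADD", "R2", "R3")          # ALU ADD R2, R3
```

Note that `execute` has no knowledge of addition or memory. Supporting another operation means extending a functional unit, not the sequencer.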
uE Memories
uEs view memories differently than the XScale does:
Do not map memories and I/O devices into a linear address space
Do not view memories as a seamless, uniform repository
uE ISA: requires a separate instruction for each type of memory and I/O device
SRAM[read, $$x, address1, address2]
Programmer: must permanently bind each data item to a specific type of memory
Execution Pipeline
What is a pipeline? Why is pipelining employed?
One instruction is executed per cycle if the pipeline is properly designed
uEs use a five-stage or six-stage pipeline
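To make the "one instruction per cycle" claim concrete, the sketch below compares cycle counts with and without pipelining. It assumes every stage takes exactly one cycle and that an ideal pipeline never stalls. The function names are illustrative.

```python
def pipelined_cycles(num_instructions, num_stages, stall_cycles=0):
    """Cycles needed on a pipeline: fill it once, then one result per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1) + stall_cycles


def unpipelined_cycles(num_instructions, num_stages):
    """Cycles needed if each instruction passes through all stages alone."""
    return num_instructions * num_stages


# 100 instructions on a five-stage pipeline, no stalls.
ideal = pipelined_cycles(100, 5)
serial = unpipelined_cycles(100, 5)
```

With these assumptions, 100 instructions take 104 cycles on the five-stage pipeline versus 500 cycles without pipelining, so throughput approaches one instruction per cycle as the instruction count grows. Any stall cycles add directly to the total.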
Pipelining
Pipelining Problems
Possible sources of pipelining problems:
Data dependencies
Control dependencies
Resource dependencies
Memory accesses
How do pipelining problems impact system performance?
How can these impacts be removed or reduced?
Remove the sources so that no stall happens
Hide the impact of pipeline stalls
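Pipeline Stalls
K: ALU ADD R2, R1, R2
K+1: ALU ADD R3, R2, R3
Instruction K+1 reads R2, which instruction K has not yet written, so K+1 must stall (data dependency)
Control dependencies and memory accesses have an even bigger impact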
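The data dependency between instructions K and K+1 can be detected mechanically. The sketch below assumes the first register operand is the destination and the remaining operands are sources. That operand order is an assumption made for illustration, as are the function names.

```python
def parse(instruction):
    """Parse 'ALU ADD R2, R1, R2' into (destination, [sources])."""
    _unit, _op, operands = instruction.split(None, 2)
    regs = [r.strip() for r in operands.split(",")]
    return regs[0], regs[1:]  # assumed: first operand is the destination


def has_data_dependency(earlier, later):
    """True if 'later' reads a register that 'earlier' writes."""
    destination, _ = parse(earlier)
    _, sources = parse(later)
    return destination in sources


k = "ALU ADD R2, R1, R2"
k_plus_1 = "ALU ADD R3, R2, R3"
stall_needed = has_data_dependency(k, k_plus_1)
```

Here `stall_needed` is true because K+1 uses R2 as a source while K is still computing it. Replacing K+1 with an instruction that touches only other registers removes the dependency.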
Threading Illustration
Hardware Threads
uEs support 8 hardware thread contexts
Only one thread can execute at any given time
When a stall occurs, the uE can switch to another thread (if that thread is not stalled)
Very low overhead for a context switch:
Zero-cycle context switch
In practice, a switch can take around three cycles because of the pipeline flush
Switching rules:
If a thread stalls, check whether the next thread is ready for processing
Keep trying until a ready thread is found
If none is available, stall the uE and wait for any thread to unblock
Improves overall throughput
Questions:
Why not 16 or 32 threads?
Why not have 48 uEs with 1 thread each?
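The switching rules on this slide amount to a round-robin search over the thread contexts. The following is a minimal sketch of that search, assuming thread readiness is available as a list of booleans. The function name and the data representation are illustrative and do not describe the actual hardware arbiter.

```python
NUM_THREADS = 8  # hardware thread contexts per uE, as stated on the slide


def next_ready_thread(current, ready):
    """Return the next ready thread after `current`, or None if all are stalled.

    `ready[i]` is True when thread i can run. The search starts at the
    thread after `current` and wraps around, ending with `current` itself.
    """
    n = len(ready)
    for step in range(1, n + 1):
        candidate = (current + step) % n
        if ready[candidate]:
            return candidate
    return None  # no thread is ready: the uE itself stalls


# Thread 0 stalls on a memory access; only thread 2 is ready.
ready = [False] * NUM_THREADS
ready[2] = True
chosen = next_ready_thread(0, ready)
```

In this run `chosen` is 2. If every entry of `ready` were False, the function would return None, which corresponds to the uE waiting for any thread to unblock.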
Summary
Control processor (slow path): XScale core
Overall architecture
Typical functions
Processor features
Packet processing processor (fast path): Microengines
Architecture and features
Differences from conventional processors
Pipelining and multi-threading
Lab3 Brief
Intel Reference Systems
SDK Tutorial
Lab 3
Intel Reference Systems
Hardware testbed:
IXP2400 network processors
QDR SRAM, Flash ROM and other memories
1 Gbps optical Ethernet ports
100 Mbps Ethernet management port
Serial interface
PCI interfaces
SDK (software development kit):
Compiler
Assembler, linker
Simulator
Reference code
Lab3: Forwarding, Counting & Classification
Goal: to explore the basic functionality of the IXP2400 software development kit and Microengines.
3 parts:
Part I: collect a number of workload statistics from the IXP SDK simulator. Follow the steps in the lab instructions.
Part II: add one counting block to count the number of packets.
Part III: implement a simple packet classification mechanism.
Tools: All three parts require access to a machine that has the Intel SDK installed. If you want, you can also request an installation CD for your own machine; check with the TA.
Part I: Forwarding Simulation
Run an implementation of IP forwarding on the IXP2400 simulator. All the code is provided to you.
Collect a set of workload statistics that are reported by the simulator.
Part II: Forwarding and Counting
Modify the above application by adding a counter block
Store how many packets are received.
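Conceptually, a counter block sits in the packet path, increments a stored count for each packet, and passes the packet on unchanged. The sketch below shows that structure in Python only to illustrate the idea. The class and function names are hypothetical, `forward` is a stand-in for the provided forwarding code, and the actual lab is written against the IXP SDK and not in Python.

```python
class CounterBlock:
    """Counts packets and passes each one through unchanged."""

    def __init__(self):
        self.packets_received = 0  # stored count of received packets

    def process(self, packet):
        self.packets_received += 1
        return packet


def forward(packet):
    """Placeholder for the provided forwarding code (assumed, not real)."""
    return packet


def run(packets, counter):
    """Send every packet through the counter block, then forward it."""
    return [forward(counter.process(p)) for p in packets]


counter = CounterBlock()
forwarded = run([b"pkt-a", b"pkt-b", b"pkt-c"], counter)
```

After this run `counter.packets_received` is 3 and the packets themselves are unmodified, which is the property a counting block should preserve.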
How to do Lab3
Use a Windows machine with the SDK installed
Download the lab instructions and source code from Blackboard
Start early.
Very exciting lab.
Due dates:
Part I and Part II: 10/13
Part III: 10/20