Post on 03-Feb-2022
The Scalable Communications Core: A Multi-Core Wireless Baseband Prototype
Dr. Anthony (Tony) Chun, DSP Architect
Wireless Communications Lab, Corporate Technology Group
Intel Corporation, anthony.l.chun@intel.com
IEEE SCV Signal Processing Society, Feb. 9, 2009
IEEE Signal Processing Society2
Agenda
• Introduction
• Motivation
• Architecture
• Programming
• Test Chip
• Implementation Examples
• Learnings
• Summary

Introduction
• What is the Scalable Communications Core?
– Flexible baseband
– Supports multiple communication standards
– Multi-core DSP
– Heterogeneous coarse-grained accelerators
– NoC interconnect
• Contributions
– Developed area and energy-efficient architecture
– Developed programming technology
– Taped out first test chip
– Validated WiFi and WiMAX
– Mapped components of Bluetooth, DVB-H and GPS

Why is this work interesting?
• Intersection of several disciplines
– Communications
– Signal Processing
– Algorithms
– Architecture
– On chip interconnect
– Programming tools

Motivation: Too Many Radios in Future Platforms
Vision: Connectivity anytime, anywhere to any network
(Figure: radios in a future platform: Near Field Communication, 60GHz, UWB, Bluetooth, WiMAX, WiFi A,B,G,N, 3G, DTV, GPS)
Problems:
• Large form factor
• Many SKUs
• RF interference

SCC Architecture Overview
• Heterogeneous coarse-grained Processing Elements
– Each is programmable within its domain
– Support for multiple threads within PEs
– Stream processing
– Distributed memory
• Network-on-Chip (NoC) interconnect
– Packet-based
– Direct connection to nearest neighbors
– Stringent latency requirements
• Data-driven distributed control
– Control embedded within packet header
– Microcontroller is used only for low rate configuration

Flexibility Tradeoffs
• A flexible architecture trades off three vectors: power, area and flexibility
• ASIC: low flexibility, low power, small area
• Digital Signal Processor (DSP): high flexibility, high power, medium area
• FPGA: high flexibility, high power, large area
• SCC: medium flexibility, medium power, small area
• SCC solution offers best combination of energy efficiency, area efficiency and flexibility
(Conceptual chart for multiple basebands; axes: high power, large area, high flexibility)

Observation: Many Commonalities Between Wireless Standards
(Table: check-mark matrix of algorithms against standards. Standards: WiFi, WiMax, DVB-T, UWB, 60GHz, 3G. Algorithms: Correlation, FFT, Channel Estimation, QAM Mapping, Interleaving, Convolutional Coding, Turbo Coding, Reed-Solomon Coding, Randomization, CRC, Spreading, FIR / IIR.)
Wireless standards share many of the same DSP algorithms

Architecture Considerations
• Large superset of protocols, but only a few are active concurrently
• Complex control procedures with strict timing requirements
• Pipelined data flow through protocol stack
• Must support variable data block sizes
• Must be able to constrain timing jitter and latency

Solution: Heterogeneous Processors on a 3-ary 2-cube NoC
• Heterogeneous Processing Elements
– Digital Front End (DFE)
– Data-Stream Processing Engine (DPE)
– Interleaving (ILV)
– High-Speed Viterbi (HSV)
– Low Power Viterbi (LPV)
– Turbo-Decoder (TD)
– Convolutional Coder (CC)
– Reed-Solomon Decoder (RSD)
– Reed-Solomon Encoder (RSE)
• 3-ary 2-cube NoC Data Plane
• 32-bit ARC™ RISC Processor
• 32-bit OCP™ Control Plane
• PLME Mailboxes
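A k-ary n-cube with k = 3 and n = 2 is a 3 x 3 grid whose rows and columns wrap around, so each of the nine routers has four nearest neighbors. The sketch below enumerates those neighbors and the minimal hop count between two routers. The coordinate scheme and the function names are illustrative assumptions, not taken from the SCC design.

```python
from itertools import product


def neighbors(node, k=3, n=2):
    """Nearest neighbors of `node` in a k-ary n-cube (torus with wraparound)."""
    result = []
    for dim in range(n):
        for step in (-1, 1):
            coords = list(node)
            coords[dim] = (coords[dim] + step) % k  # wrap around the ring
            candidate = tuple(coords)
            if candidate != node and candidate not in result:
                result.append(candidate)
    return result


def hop_distance(a, b, k=3):
    """Minimal number of hops between routers a and b, going either way round each ring."""
    return sum(min((x - y) % k, (y - x) % k) for x, y in zip(a, b))


# All nine routers of the 3-ary 2-cube, addressed by hypothetical (row, column) pairs.
nodes = list(product(range(3), repeat=2))
diameter = max(hop_distance(a, b) for a in nodes for b in nodes)
```

With wraparound links, no two routers in this topology are more than two hops apart on a shortest path.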

Solution: Scalable Communications Core
A Baseband Processor for WiFi, WiMAX, and DVB Multi-radio
(Block diagram: Heterogeneous Processing Elements on 3-ary 2-cube NoC, showing the DFE, DPE, ILV, HSV, LPV, TD, CC, RSD and RSE processing elements, the 3-ary 2-cube NoC data plane, the 32-bit OCP™ control plane, the PLME mailboxes and the 32-bit ARC™ RISC processor)

Data Stream Processing Element
(Block diagram: StackCore, EngineCore with VLIW microcode, address generator, NCO, OCP registers, and data router adaptor to/from mesh)
• 16-bit microcontroller (StackCore)
– Configuration
– Micro/Macro-sequencing
– Scalar arithmetic
– Programmed using C or assembly
• Complex DSP machine (EngineCore)
– Highly reconfigurable data path
– Crossbar connections
– Complex mult, add, sub, shift, round, sat, trunc, conj.
• Split VLIW microcode
– Long Configuration Words
– Long Address Words
• Address Generators
• Stream programming model

Reed-Solomon Decoding
• Maximum throughput
– 84.2Mbps ATSC
– 105.8Mbps DVB-H (PHY)
– 22.9Mbps DVB-H (MPE)
• Up to 4 resident configurations
– $GF(2^m)$; $m \le 8$
– $T \le 32$
– $g(x) = (x+1)(x+a)\cdots(x+a^{2T-1})$
– $p(x) = c_0 x^m + c_1 x^{m-1} + \cdots + c_{m-1} x + 1$
• Up to 4 simultaneous streams
• Example supported standards:
– ATSC
– DVB-H
– 802.16d/e
– ITU-T J.83
• Integrated clock gating
• Fine grained power management
(Block diagram: data from mesh enters Input DMA & Codeword Reassembly, then passes through the Syndrome Calculator (Horner's Rule), the Key Equation Solver (Berlekamp-Massey Algorithm), the Error Locator & Evaluator (Chien Search & Forney Algorithm), and Error Correction & Output DMA to mesh. Supporting blocks: Header Table RAM, Code Profile Registers, Context RAM, Codeword RAMs, 1/x LUTs, Switch Matrix, and an OCP Slave Socket for control.)
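The syndrome calculator evaluates the received codeword at the roots of g(x), that is at 1, a, ..., a^(2T-1), and Horner's rule does this with one multiply and one add per symbol. The following is a minimal sketch of that step over GF(2^8). The primitive polynomial 0x11D, the choice a = 2, the message values and all function names are assumptions for illustration; the slide does not specify them.

```python
def gf_mul(a, b, prim=0x11D, m=8):
    """Multiply two GF(2^m) elements (shift-and-add, reduced by `prim`)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & (1 << m):
            a ^= prim
        b >>= 1
    return result


def poly_mul(p, q):
    """Multiply polynomials with GF(2^m) coefficients, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf_mul(a, b)
    return out


def generator(two_t, alpha=2):
    """g(x) = (x + 1)(x + a)...(x + a^(2T-1)); addition equals subtraction here."""
    g, root = [1], 1
    for _ in range(two_t):
        g = poly_mul(g, [1, root])
        root = gf_mul(root, alpha)
    return g


def syndromes(codeword, two_t, alpha=2):
    """S_j = c(a^j) for j = 0..2T-1, each evaluated with Horner's rule."""
    out, root = [], 1
    for _ in range(two_t):
        acc = 0
        for symbol in codeword:          # highest-degree coefficient first
            acc = gf_mul(acc, root) ^ symbol
        out.append(acc)
        root = gf_mul(root, alpha)
    return out


TWO_T = 4                                 # hypothetical code with T = 2
message = [1, 2, 3, 4, 5]                 # hypothetical message symbols
clean = poly_mul(message, generator(TWO_T))   # non-systematic codeword m(x) g(x)
corrupted = list(clean)
corrupted[3] ^= 0x55                      # inject a single symbol error
```

A valid codeword yields all-zero syndromes; any nonzero syndrome hands the word to the key equation solver.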

Opportunity: Radio Composition using Shared Resources
• Smaller – reduce redundancy by sharing resources
• More Energy Efficient – reduced redundancy equates to lower leakage
• Scalable – can easily add new processing elements to cover emerging standards
• Wider Roaming – can compose radios on-the-fly based on signals detected in the air
• Improved Coexistence – wider array of future interference mitigation and coordination options
• Potential Time to Market Reduction – future drag and drop methodology for building a multi-radio baseband processor using well characterized processing elements on a flexible and scalable interconnect

Dataflow & Resource Sharing: WiFi vs. Mobile WiMAX TX Case
(Figure: Mobile WiMAX and WiFi transmit dataflows with shared resources highlighted)

Dataflow & Resource Sharing: Fixed WiMAX vs. DVB RX Case
(Figure: Fixed WiMAX and DVB receive dataflows with shared resources highlighted)

Distributed Memory
(Charts: "Memory Bandwidth", accesses per second for 802.11n, 802.16e and DVB, cumulative and single stream, shown for DSP + FEC and for DSP alone. "Number of Ports vs. Clock Frequency", number of ports required at 125, 250 and 500 MHz for DVB, 802.16e and 802.11n, again for DSP + FEC and for DSP alone.)
Shared memory not practical – distributed memory required for bandwidth.
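The link between the two chart types can be sketched with simple arithmetic: if each memory port serves one access per clock cycle (an assumption), the number of ports a shared memory would need is the access rate divided by the clock frequency, rounded up. The access rate used below is an illustrative figure, not a value read from the charts.

```python
def ports_required(accesses_per_second: int, clock_hz: int) -> int:
    """Ports needed if every port completes one access per clock cycle."""
    return -(-accesses_per_second // clock_hz)   # integer ceiling division


# Hypothetical workload: 6e9 memory accesses per second.
ILLUSTRATIVE_ACCESSES = 6_000_000_000

ports_by_clock = {
    mhz: ports_required(ILLUSTRATIVE_ACCESSES, mhz * 1_000_000)
    for mhz in (125, 250, 500)
}
```

Doubling the clock halves the port count, but a multi-ported shared memory with dozens of ports stays impractical, which is the argument for distributing the memory among the PEs.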

Interconnect Considerations
(Chart: "Power vs. Flexibility", power metric against flexibility metric for the NoC, Sparse OCP matrix, Split OCP matrix and Full OCP matrix)
(Diagrams: Full Matrix (shared bus), Split Matrix (segmented bus), Sparse Matrix and 3-ary 2-cube NoC, each connecting DPE0, DPE1, DFE0, DFE1, DFE2, HSV, ILV, RSE, CC, RSD, LPV, TD, MAC0, MAC1 and MAC2)
NoC provides lowest power with maximum flexibility

NoC Issues
• Latency – caused by multiple streams contending for a shared interconnect
• Jitter – caused by time division multiplexing with variations in workload

Using Fragmentation to Constrain Latency
(Figure: a single long packet compared with many small fragments)
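One way to see why fragmentation bounds latency: a stream that finds a link busy must wait for the unit currently in flight, so the worst-case wait scales with the largest unit allowed on the link. The sketch below splits a payload into fragments and compares the two waits. The link rate of 100 bytes per microsecond and the 64-byte fragment size are hypothetical numbers chosen only for the arithmetic.

```python
def fragment(payload: bytes, max_fragment: int) -> list:
    """Split a payload into fragments of at most max_fragment bytes."""
    return [payload[i:i + max_fragment] for i in range(0, len(payload), max_fragment)]


def worst_case_wait_us(blocking_bytes: int, link_bytes_per_us: float) -> float:
    """Time a competing stream waits for one in-flight unit to clear the link."""
    return blocking_bytes / link_bytes_per_us


LINK_BYTES_PER_US = 100.0          # hypothetical link rate
long_packet = bytes(4096)          # hypothetical long packet
fragments = fragment(long_packet, 64)

wait_unfragmented = worst_case_wait_us(len(long_packet), LINK_BYTES_PER_US)
wait_fragmented = worst_case_wait_us(max(len(f) for f in fragments), LINK_BYTES_PER_US)
```

Under these assumed numbers the wait behind one unit drops by the ratio of packet size to fragment size.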

Using Time Division Multiplexing to Share Interconnect Segments
(Figure: DSP blocks for transfer, multiplexed fragments on the shared segment, and demultiplexed DSP blocks)
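The multiplexing step can be sketched as a round-robin interleave of fragments from several streams, followed by a demultiplex that regroups them by stream. Round-robin ordering and the stream names are assumptions made for illustration; the slide does not state the arbitration policy.

```python
from itertools import zip_longest


def multiplex(streams: dict) -> list:
    """Interleave fragments round-robin; each slot is (stream_id, fragment)."""
    tagged = [[(sid, frag) for frag in frags] for sid, frags in streams.items()]
    slots = []
    for group in zip_longest(*tagged):
        slots.extend(item for item in group if item is not None)
    return slots


def demultiplex(slots: list) -> dict:
    """Regroup the interleaved slots into per-stream fragment lists."""
    out = {}
    for sid, frag in slots:
        out.setdefault(sid, []).append(frag)
    return out


# Two hypothetical DSP blocks, already fragmented, sharing one segment.
streams = {"fft": [b"a", b"b", b"c"], "viterbi": [b"x", b"y"]}
shared_segment = multiplex(streams)
recovered = demultiplex(shared_segment)
```

Because every fragment carries its stream tag, the receiver can rebuild each block even though the fragments arrive interleaved.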

Using Timestamps to Constrain Jitter
(Figure: packets arrive at the router (north, south, east, west ports) with jitter, and functions f(x, y ... z) complete with jitter. Each result is held until the time reference equals its timestamp, then released to the output.)
• Packets arrive with jitter
• Functions complete with jitter
• Output transmission is precisely timed
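The comparator in the figure can be sketched as a holding buffer: a result that completes early waits until the time reference equals its timestamp. The tick units, the tuple layout and the function name below are illustrative assumptions.

```python
def timed_release(results, end_time):
    """Release each (timestamp, completed_at, value) when reference == timestamp.

    Returns a list of (release_time, value). Times are in arbitrary ticks.
    """
    waiting = sorted(results, key=lambda r: r[0])
    released = []
    for reference in range(end_time + 1):
        # Comparator: output only when the reference equals the timestamp.
        while waiting and waiting[0][0] == reference:
            timestamp, completed_at, value = waiting.pop(0)
            if completed_at > reference:
                raise RuntimeError(f"result {value!r} missed its timestamp")
            released.append((reference, value))
    return released


# Hypothetical results that complete at irregular times (3, 9, 11)
# but are stamped for evenly spaced output (5, 10, 15).
jittered = [(5, 3, "sym0"), (10, 9, "sym1"), (15, 11, "sym2")]
output = timed_release(jittered, end_time=20)
```

The completion times are irregular, yet the release times are evenly spaced, provided each function finishes before its timestamp.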

Data Driven Processing: Using a System of Tags to form Linked Lists
• Stream ID references a context for multi-stream processing
• Function ID references function parameters
• Output header contains route to next PE, FID, & SID
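The tag scheme can be pictured as a linked list spread across the PEs: each packet header names a function (FID) and a stream (SID), and the function's table entry supplies the header for the next hop. The sketch below is a toy model under stated assumptions: the route is simplified to the name of the next PE, and the class names, the two example functions and their parameters are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Header:
    route: str   # simplified: name of the PE that should process the packet
    fid: int     # Function ID: selects function parameters at that PE
    sid: int     # Stream ID: selects the per-stream context at that PE


@dataclass
class FunctionEntry:
    func: Callable            # (payload, params, context) -> payload
    params: dict
    next_route: Optional[str]  # None marks the end of the linked list
    next_fid: Optional[int]


@dataclass
class ProcessingElement:
    functions: Dict[int, FunctionEntry] = field(default_factory=dict)
    contexts: Dict[int, dict] = field(default_factory=dict)

    def process(self, header, payload):
        entry = self.functions[header.fid]
        context = self.contexts.setdefault(header.sid, {})
        result = entry.func(payload, entry.params, context)
        if entry.next_fid is None:
            return None, result
        return Header(entry.next_route, entry.next_fid, header.sid), result


def scale(payload, params, context):
    context["packets"] = context.get("packets", 0) + 1   # per-stream state
    return [x * params["gain"] for x in payload]


def offset(payload, params, context):
    return [x + params["bias"] for x in payload]


pes = {"pe_a": ProcessingElement(), "pe_b": ProcessingElement()}
pes["pe_a"].functions[1] = FunctionEntry(scale, {"gain": 2}, "pe_b", 7)
pes["pe_b"].functions[7] = FunctionEntry(offset, {"bias": 1}, None, None)


def run(header, payload):
    """Follow the chain of output headers until no next PE is named."""
    while header is not None:
        header, payload = pes[header.route].process(header, payload)
    return payload


result = run(Header("pe_a", 1, sid=0), [1, 2, 3])
```

No central controller sequences the two steps; the order is encoded entirely in the table entries that each header points to.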

NoC Performance Requirements
Worst Case NoC Throughput (RX coded soft-bits @ 8 bits/soft-bit)
Protocol                   Throughput (Mbps)
802.11n                    1248
802.16e                    336
DVB                        314
per channel (aggregate)    1898
Worst Case NoC Latency (802.11n SIFS timing budget)
Budget                     Latency (µs)
802.11n SIFS               16.0
PHY Budget                 10.0
MAC Budget                 6.0
PE Budget                  5.8
NoC Budget                 4.2
per channel (7 hops)       0.6
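The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch assumes one particular reading of the latency table: the SIFS budget splits into MAC and PHY, the PHY budget splits into PE and NoC, and the NoC budget is spread evenly over 7 hops. That decomposition is an interpretation, not something the slide states outright.

```python
throughput_mbps = {"802.11n": 1248, "802.16e": 336, "DVB": 314}
aggregate_mbps = sum(throughput_mbps.values())

sifs_us = 16.0
mac_budget_us = 6.0
phy_budget_us = sifs_us - mac_budget_us       # what remains for the PHY
pe_budget_us = 5.8
noc_budget_us = phy_budget_us - pe_budget_us  # what remains for the NoC
hops = 7
per_hop_us = noc_budget_us / hops

print(f"aggregate throughput: {aggregate_mbps} Mbps")
print(f"PHY budget: {phy_budget_us:.1f} us, NoC budget: {noc_budget_us:.1f} us")
print(f"budget per hop over {hops} hops: {per_hop_us:.1f} us")
```

Under this reading the numbers agree with the tabulated 1898 Mbps, 10.0 µs, 4.2 µs and 0.6 µs.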
Latency is Constrained by Packet Size Not by Choice of Routing Algorithm

Programming Technology Challenges
• Vision: program the architecture as if it were a single DSP
– We are not there yet
• Programming of heterogeneous accelerators
– Degree of programmability varies, e.g. the DPE is more programmable than the Viterbi decoder
– Compilers for DPE and ILVPE
– Other PEs are configured via registers
• Parallel programming model is in progress

Programming SCC
• Map algorithms
• Code algorithms
• Build
• Debug
• Simulate
• Profile
Mapping of 802.11n Rx
PPL (Parallel Programming Language) describes protocol mapping
(Figures: the 802.11n Rx mapping highlighted in turn for the DFE, DPE, ILVPE, HVD, CCPE and ARC. DPE blocks shown: FFT, Phase Tracking, Soft Bits, Channel Estimation, Equalizer / Spatial Demapper.)
Algorithms mapped to PEs

DPE Example: FFT
DPE configuration for 64 pt radix-4 FFT
Data stream Programming Language
selector[iter:16] dataSelector1[index:4] = {{ iter + index * 16}};
selector[iter:16] dataSelector2[index:4] = {{(iter%4) + (iter/4) * 16 + index * 4}};
selector[iter:16] dataSelector3[index:4] = {{iter * 4 + index}};
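The three selectors can be read as the input addressing for the three stages of a 64-point radix-4 FFT: 16 butterflies per stage, each taking 4 samples, at strides of 16, 4 and 1. The Python below mirrors the selector expressions to show which samples each butterfly reads. It assumes that "/" in the selector language is integer division, and the helper name `butterfly_inputs` is made up for this sketch.

```python
def data_selector1(it, index):
    return it + index * 16                         # stride 16


def data_selector2(it, index):
    return (it % 4) + (it // 4) * 16 + index * 4   # stride 4 within blocks of 16


def data_selector3(it, index):
    return it * 4 + index                          # stride 1 within blocks of 4


def butterfly_inputs(selector):
    """Sample indices read by each of the 16 radix-4 butterflies of one stage."""
    return [[selector(it, index) for index in range(4)] for it in range(16)]


stage1 = butterfly_inputs(data_selector1)
stage2 = butterfly_inputs(data_selector2)
stage3 = butterfly_inputs(data_selector3)

print(stage1[0])   # [0, 16, 32, 48]
print(stage2[5])   # [17, 21, 25, 29]
print(stage3[0])   # [0, 1, 2, 3]
```

Each stage touches every one of the 64 samples exactly once, which is what a full pass of 16 four-input butterflies requires.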

Onega Test Chip
• 65nm process
• Taped out in Dec 2007
• Subset of PEs included

Silicon Results
• Process technology: 65nm
• Silicon area (excl. pads): 20.75 mm²
• Program memory (DPE1, DPE2 and microcontroller): 96+96+32 = 224 kbytes
• Data memory (DPE1, DPE2 and microcontroller): 16+256+8 = 280 kbytes
• Logic gate count: 1.36M
• Supply voltage: 1.1V core, 3.3V I/O
• Package: WB-PBGA 31x31 mm
• Signal I/O count: 332
• Measured clock frequency: 233MHz
• Paper summarizing power measurements has been submitted to ISVLSI 2009

Areas of Processing Elements
Label    Classification                     Area (mm²)
DFE      AGC, resample, filter, detect      3.600
DPE1     64-point FFT/IFFT, chn eq, QAM     2.13
DPE2     8k-point FFT/IFFT, chn eq, QAM     4.69
ILV      puncture, interleave, multiplex    1.57
LPV      Low power Viterbi decoding         0.21
RSD      RS decoding                        0.26
RSE      RS encoding                        0.09
CCPE     CRC, randomization, coding         0.21
NoC      Interconnect                       0.091
ARC      Configuration and control          0.73

Protocol Implementations (to date)
• 802.11a/n – subset of MCSs (limited by Viterbi decoder rate of 54 Mbps) tested on Onega silicon
– 16 µs SIFS requirement met
• 802.16e: range of modes validated on Onega silicon
• DVB-H: Rx on RTL simulator
• Bluetooth: modulation and demodulation on DPE simulator
• GPS: code acquisition and tracking on DPE simulator

802.11a High Rate Inter-Symbol Control
(Sequence diagram: exchanges between RFIC, DFE, DPE, ILV, LPV, CC, ARC and MAC over the ADC (samples), NoC (packets), OCP (dwords) and Mailbox (messages). Labels shown: ccaRst, rxCfg, rxOn, ccaBusy, samples, sts, lts, signal, htSig, htSts, htLts, rxStart, psdu, rxEnd, rxOff.)
• Symbols arrive at 250kHz rate tagged by type
• Header delivered to ARC™
• Payload delivered to MAC

802.11a Low Rate Inter-Frame Control
(Sequence diagram: the same exchanges between RFIC, DFE, DPE, ILV, LPV, CC, ARC and MAC over the ADC (samples), NoC (packets), OCP (dwords) and Mailbox (messages), annotated with the ARC™ actions below.)
• ARC™ initiates RX operation
• ARC™ processes header and adjusts configuration
• ARC™ terminates RX operation

GFSK modulator / demodulator
Modulator (Transmitter)
Choose Tx sampling rate as 8 Ms/s.
(Block diagram: p[k], S/P (3 bits), Look-up Table (8 samples), Z^-1, P/S, s[n]; look-up table output y[i], y[i+1], ..., y[i+7])
Demodulator (Receiver)
Choose Rx sampling rate as 8 Ms/s.
(Block diagram: I(n) and Q(n) with Z^-8 delays giving I(n-8) and Q(n-8), combined by sum and difference into X[n]; further Z^-8 and Z^-1 stages with signals w[n] and m[n]; 1/0 decision output)
Demodulator operates at 200 kbps (goal is 1 Mbps)
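The modulator's look-up table can be sketched as follows: a 3-bit window selects 8 precomputed samples, so the pulse-shaping filter never runs at sample rate. This sketch assumes the 3 bits are the previous, current and next data bits, that the table holds Gaussian-filtered frequency samples with BT = 0.5, and that there are 8 samples per bit. None of these details are given on the slide, and all names are hypothetical.

```python
import math

SAMPLES_PER_BIT = 8   # assumed: 8 Ms/s at 1 Mbps
BT = 0.5              # assumed bandwidth-time product


def gaussian_taps(bt=BT, sps=SAMPLES_PER_BIT):
    """Normalized Gaussian filter taps spanning one bit period on each side."""
    sigma = math.sqrt(math.log(2)) / (2 * math.pi * bt) * sps   # in samples
    taps = [math.exp(-0.5 * (n / sigma) ** 2) for n in range(-sps, sps + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def build_lut():
    """Map each 3-bit window to the 8 filtered samples of its middle bit."""
    taps = gaussian_taps()
    lut = {}
    for window in range(8):
        bits = [(window >> shift) & 1 for shift in (2, 1, 0)]   # prev, current, next
        nrz = [2 * b - 1 for b in bits for _ in range(SAMPLES_PER_BIT)]
        lut[window] = [
            sum(t * nrz[n + k] for k, t in enumerate(taps))
            for n in range(SAMPLES_PER_BIT)
        ]
    return lut


def modulate(bits):
    """Frequency samples for a non-empty bit sequence, 8 per bit, from the table."""
    lut = build_lut()
    padded = [bits[0]] + list(bits) + [bits[-1]]   # repeat edge bits as padding
    samples = []
    for k in range(1, len(padded) - 1):
        window = (padded[k - 1] << 2) | (padded[k] << 1) | padded[k + 1]
        samples.extend(lut[window])
    return samples


freq_samples = modulate([1, 0, 1, 1])
```

The output here is a normalized frequency deviation; integrating it into phase and forming I/Q samples is left out of the sketch.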

GPS Code Acquisition
(Onega data flow: GPS C/A Code data followed by Gold Code 1, Gold Code 2 and Gold Code 3 arrive as logical packets; markers show the end of receiving GPS C/A Code data and the end of processing Gold Code 1. Results sent to ARC: peak value, Doppler shift, code shift.)
(DPE processing blocks: FFT of C/A Code data, FFT of Gold Code, Doppler correction, IFFT, |.|², Find Max with a loop over all Doppler bins, then Find Max with a loop over all satellites (different Gold codes). Outputs: peak, Doppler index, satellite #.)
Acquires four satellites in 9 ms.
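The processing chain on this slide matches the common FFT-based search: correlate the received data with a code replica in the frequency domain, take the squared magnitude, and keep the largest peak over all Doppler bins. The sketch below is a toy version under stated assumptions: a random ±1 sequence of length 64 stands in for a real Gold code, Doppler is searched in whole FFT bins, and the order of the Doppler correction relative to the FFT is a guess.

```python
import cmath
import random


def fft(x, inverse=False):
    """Recursive radix-2 FFT (length must be a power of two, inverse is unscaled)."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def acquire(samples, code, doppler_bins):
    """Return (peak power, Doppler bin, code shift) of the strongest correlation."""
    n = len(samples)
    code_fft_conj = [c.conjugate() for c in fft([complex(c) for c in code])]
    best = (0.0, None, None)
    for d in doppler_bins:
        # Doppler correction: remove the trial frequency offset.
        corrected = [s * cmath.exp(-2j * cmath.pi * d * i / n)
                     for i, s in enumerate(samples)]
        spectrum = fft(corrected)
        # Circular correlation = IFFT(FFT(data) * conj(FFT(code))).
        corr = fft([a * b for a, b in zip(spectrum, code_fft_conj)], inverse=True)
        for shift, value in enumerate(corr):
            power = abs(value) ** 2
            if power > best[0]:
                best = (power, d, shift)
    return best


N = 64
rng = random.Random(1)
code = [rng.choice((-1, 1)) for _ in range(N)]     # stand-in for a Gold code
TRUE_SHIFT, TRUE_DOPPLER = 17, 3                   # hypothetical channel
received = [code[(i - TRUE_SHIFT) % N] * cmath.exp(2j * cmath.pi * TRUE_DOPPLER * i / N)
            for i in range(N)]

peak, doppler, shift = acquire(received, code, range(-4, 5))
```

Searching several satellites repeats the same loop with a different code replica each time.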

Learnings
• Heterogeneous coarse-grained PE NoC architecture validated as real-time wireless baseband
• Area and power competitive with fixed solutions
• Stream programming model and tools developed
• General parallel programming tool for entire set of PEs remains a goal

Summary
• We have demonstrated a flexible radio baseband
• Taped out test chip, programmed it, validated and measured power
• Next steps
– Implement additional protocols
– Improvements to the architecture
– Can our learnings be applied to other signal processing applications?

Acknowledgements
Aliaksei Chapyzhenka, Anton Bobkov, Vladimir Pudovkin, Veronica Mikheeva, Alexey Kostyakov, Tatiana Stounina, Victoria Slavinskaya, Mariano Aguirre, Jorge Carballido, Arturo Veloz, David Arditti, Brando Perez Esparza, Victor Rivera Alvarez, Carlos Ornelas, Luis Cuellar, and Edgar Borrayo Sandoval, Jeffrey Hoffman, Thomas Tetzlaff, Frank Carroll, Kyle McCanta, Jenny Chang, Jane Lin, Kapil Gulati, David Bormann, Denise Souza, Kirk Skeba, Ernest Tsui, Inching Chen