Date post: 19-Dec-2015

Software Defined Radio – A High Performance Embedded Challenge

Hyunseok Lee, Yuan Lin, Yoav Harel, Mark Woh, Scott Mahlke, Trevor Mudge, and ¹Krisztian Flautner

University of Michigan, ¹ARM Ltd

Advanced Computer Architecture Laboratory, University of Michigan

Contents

- Software defined radio
- Categories of wireless networks
- Core technologies for future networks
- Case study: W-CDMA network
  - Major algorithms
  - Workload characterization
  - Architectural implications


Software Defined Radio

Wireless Communication System

Application -> (bits) -> Upper Protocol Layers -> (packets) -> Physical Layer (PHY) -> "Air"

Upper Protocol Layers:
- Transport: TCP/UDP
- Network: IP
- LINK: PPP
- MAC

Physical Layer (PHY):
- Baseband Processing
- Analog Front-end

Anatomy of Cellular Phone

- Baseband Processor
- Analog Frontend
- Application Processor
- Power Manager
- Bluetooth
- GPS
- Camera
- Keyboard
- Display
- Speaker

Protocol on Wireless Platform

Application Processor:
- Source coding (Audio: AMR/QCELP, Video: MPEG): DSP/Accelerator
- GPP (Software)

Baseband Processor:
- Upper layers (Transport, Network, LINK, MAC): GPP (Software)
- Physical layer (PHY): ASIC (Hardware)

Software Defined Radio (SDR)

- Use software routines instead of ASICs for the physical layer operations of a wireless communication system
  ASICs (PHY) -> Programmable Hardware + Software Routines
- Both the analog frontend and the digital baseband are within the scope of SDR

Levels of SDR

Tier | Name | Description
Tier 0 | Hardware Radio (HR) | Implemented using hardware components. Cannot be modified.
Tier 1 | Software Controlled Radio (SCR) | Only control functions are implemented in software: inter-connects, power levels, etc.
Tier 2 | Software Defined Radio (SDR) | Software control of a variety of modulation techniques, wide-band or narrow-band operation, security functions, etc.
Tier 3 | Ideal Software Radio (ISR) | Programmability extends to the entire system with analog conversion only at the antenna.
Tier 4 | Ultimate Software Radio (USR) | Defined for comparison purposes only.

<source: http://www.sdrforum.org>

Why do we need SDR?

- Seamless wireless connection (end user)
  - Wireless protocols differ widely
    - TDMA: GSM, AMPS
    - CDMA: IS-95, cdma2000, W-CDMA, IEEE 802.11b
    - OFDM: IEEE 802.11a/g/n, WiMAX
  - Needs a terminal that can support multiple wireless protocols
- Easy infrastructure upgrade (service provider)
  - Wireless protocols evolve continuously
  - Ex) W-CDMA -> W-CDMA + HSDPA
- Time to market (manufacturer)
  - Reduce hardware development time and cost

Where can we use SDR?

- Basestations
  - Weak constraints on power and area
  - Support several hundred subscribers
  - Will be commercialized first
- Wireless terminals
  - Tight constraints on power and area
  - Will be commercialized next

Why is SDR challenging?

- Analog frontend
  - Must be tunable across a range of carrier frequencies and bandwidths
- Digital baseband
  - Supercomputer-level computation power: > 50 Gops per subscriber
  - Tight power budget: 200 ~ 300 mW (at the terminal)
  - High level of programmability: combination of heterogeneous signal processing algorithms

Our Strategy

- Performance
  - Exploit the parallelism in signal processing and forward error correction (FEC) algorithms
- Power
  - Limit the programmability to minimize power consumption
  - Minimize both active and idle mode power consumption
- There is a trade-off between power efficiency and programmability


Categories of Wireless Networks

Category | Connectivity | Range | Examples
WPAN | Personal area connectivity | 10 meters | Bluetooth, UWB
WLAN | Local area connectivity | 100 meters | WiFi, HiperLan
WMAN | Metro area connectivity (city or suburb) | Beyond 100 meters | WiMax
WWAN | Wide area connectivity (broad geographic coverage) | Beyond 100 meters | AMPS, GSM, IS-95, cdma2000, W-CDMA

<source: Wireless communication technology landscape, DELL>

WWAN (Wireless Wide Area Network)

Generation | Standards (multiple access) | Service
1G (analog) | AMPS, NMT, TACS (FDMA) | Voice
2G (digital) | IS-95 (CDMA); GSM, IS-136/PDC (TDMA) | Voice
2.5G | IS-95B (CDMA); GPRS, EDGE (TDMA) | 64~384K packet
3G | cdma2000 (CDMA); W-CDMA (CDMA) | ~2M multimedia
3.5G | cdma2000 EV-DO/EV-DV (CDMA); W-CDMA/HSDPA (CDMA) | ~10M multimedia
4G | ? (OFDM) | ~100M multimedia

- Earlier generations: can be implemented by programmable DSP
- Later generations: no fully programmable H/W solutions

WLAN / WMAN

- WLAN: Wireless Local Area Network
  - High data rate
  - Poor mobility support
  - 802.11b: 11 Mbps (CDMA)
  - 802.11g: 54 Mbps (OFDM)
  - 802.11a: 54 Mbps (OFDM)
  - 802.11n: 100+ Mbps (OFDM)
- WMAN: Wireless Metro Area Network
  - For the last-mile problem
  - 802.16d (Fixed WiMax): 70 Mbps (OFDM)
  - 802.16e (Mobile WiMax): 10 Mbps (OFDM)

WPAN (Wireless Personal Area Network)

- Interconnecting personal devices
- Bluetooth 1.1: 1 Mbps
- Bluetooth 1.2
- Bluetooth 2.0: 3 Mbps
- 802.15.3a UWB: 100 ~ 480 Mbps
- 802.15.3a UWB-NG: ~ 1 Gbps


Core technologies of future networks

OFDM (Orthogonal Frequency Division Multiplexing)

- Transmit the signal over several sub-carriers
- The frequency spectra of the sub-carriers overlap (high spectral efficiency)
- Highly susceptible to frequency error in the receiver

<figure: X(f) -> IFFT -> X_IFFT(f) -> modulation by cos(f_c t) -> X_mod(f) -> wireless channel -> demodulation by cos(f_c t) -> X_demod(f) -> FFT -> X(f). The baseband spectra span the sub-carriers from -N f_sc to N f_sc; the modulated spectrum is centered at -f_c and f_c.>

Major Computation in OFDM system

- FFT / IFFT
  - N = 64: IEEE 802.11a
  - N = 256~2048: IEEE 802.16 WiMax
  - Data precision: 12~16 bits
- Amount of computation for OFDM operation: ~ 10^8 complex multiplications / sec

$X(k) = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \quad k = 0, \ldots, N-1$
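As a rough illustration of the transform on this slide, the following Python sketch evaluates the DFT sum directly. It is a sketch only: the function name `dft` is hypothetical, and a real OFDM receiver would use a fast algorithm (a radix-2 FFT needs on the order of (N/2) log2 N complex multiplications instead of the N^2 used here).

```python
import cmath

def dft(x):
    """Direct DFT: X(k) = sum_n x[n] * exp(-j*2*pi*k*n/N), k = 0..N-1."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]

# A constant input puts all of its energy in bin k = 0.
spectrum = dft([1, 1, 1, 1])
```

The direct form makes the cost visible: each of the N outputs needs N complex multiplications, which is why the FFT size N drives the OFDM workload.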

MIMO (Multiple Input Multiple Output)

- Use multiple antennas for signal transmission and reception
- In the ideal case, channel capacity increases linearly
- Can effectively compensate for the multipath fading effect
- Significantly increases receiver complexity

Channel capacity:
- <Single Input Single Output (SISO)>, one tx and one rx antenna: $C = W \log_2(1 + \mathrm{SNR})$
- <Multiple Input Multiple Output (MIMO)>, n tx antennas (Tx,1 ... Tx,n) and m rx antennas (Rx,1 ... Rx,m): $C = \min(n, m) \cdot W \log_2(1 + \mathrm{SNR})$
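The two capacity expressions on this slide can be compared numerically. The sketch below is illustrative only; the function names and the example bandwidth and SNR are assumptions, and SNR is taken as a linear ratio, not in dB.

```python
import math

def siso_capacity(bandwidth_hz, snr):
    """C = W * log2(1 + SNR), with SNR as a linear ratio."""
    return bandwidth_hz * math.log2(1 + snr)

def mimo_capacity(bandwidth_hz, snr, n_tx, n_rx):
    """Ideal-case MIMO capacity: min(n, m) times the SISO capacity."""
    return min(n_tx, n_rx) * siso_capacity(bandwidth_hz, snr)

# Example values (assumed): 1 MHz bandwidth, linear SNR of 3.
c_siso = siso_capacity(1e6, 3)        # 2 Mbit/s
c_mimo = mimo_capacity(1e6, 3, 4, 2)  # limited by the smaller antenna count
```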

Computation in MIMO receiver

- The amount of computation in the MIMO receiver is a function of
  - M: number of Tx/Rx antennas
  - L_T: length of preamble
  - L_P: length of payload
- 4 Tx/Rx antennas, 100 Mbps, 64 QAM, 1/2 coding rate: ~ 6 x 10^8 computations / sec

<source: B. Hassibi, An Efficient Square-Root Algorithm for BLAST>

2 32

292 (log )

3T pM L L M

LDPC code

- Low Density Parity Check (LDPC) code
  - Turbo-code-like coding gain with lower implementation cost
- Encoding
  - Matrix multiplication, c = xG
  - G (generator matrix) is a large matrix (e.g. a 4K x 4K matrix)
- Decoding
  - Equivalent to finding the most probable vector x such that Hx mod 2 = 0
  - H (parity check matrix) is a large sparse matrix
- Implementation
  - There is a trade-off between coding gain and implementation complexity
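The encoding and parity-check relations (c = xG, Hc mod 2 = 0) can be tried on a toy code. The sketch below uses a (7,4) Hamming code purely as a stand-in; it is an assumption for illustration, since a real LDPC code uses a much larger, sparse H and an iterative decoder, neither of which is shown here.

```python
# Toy systematic code: G = [I | P], H = [P^T | I], so H * c^T = 0 (mod 2).
P = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
K, R = 4, 3
G = [[int(i == j) for j in range(K)] + P[i] for i in range(K)]
H = [[P[i][j] for i in range(K)] + [int(j == t) for t in range(R)]
     for j in range(R)]

def block_encode(x):
    """Codeword c = xG over GF(2)."""
    return [sum(x[i] * G[i][col] for i in range(K)) % 2 for col in range(K + R)]

def syndrome(c):
    """H * c^T mod 2; all zeros means every parity check is satisfied."""
    return [sum(h * bit for h, bit in zip(row, c)) % 2 for row in H]

codeword = block_encode([1, 0, 1, 1])
corrupted = codeword[:]
corrupted[2] ^= 1  # a single bit error makes the syndrome non-zero
```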

Hybrid ARQ

- Reuse error frames for the decoding of the retransmitted frame
- Requires huge buffer space

<figure: frames 1 to 5 over time, passing from the physical layer to the link layer>
- Frame #2 arrives with errors and is stored in the hybrid ARQ buffer
- At frame #3, the link layer detects that frame #2 is missing and requests the retransmission of frame #2 from the sender
- The error frame is combined with the retransmitted frame


Case Study : W-CDMA system


Major Algorithms

Physical layer of W-CDMA

Transmitter: upper layers -> channel encoder -> interleaver -> modulator -> spreader -> scrambler -> LPF-Tx -> D/A -> frontend
Receiver: frontend -> A/D -> LPF-Rx -> searcher and descrambler/despreader branches -> combiner -> demodulator -> deinterleaver -> channel decoder (turbo/Viterbi) -> upper layers

- Channel encoder / decoder: error correction
- Interleaver / deinterleaver: overcome severe errors within a short time interval
- Modulator / demodulator: assign the signal waveform optimal for data transmission
- LPF: suppress the signal terms outside the pass band

Channel Encoder/Decoder

- Encoder
  - Adds systematic redundancy to the source data
- Decoder
  - Fixes errors in the received data using the systematic redundancy information generated by the encoder
- The W-CDMA system uses
  - Convolutional code (for short voice and control messages)
  - Turbo code (for video streams and high speed packet data)

Channel Encoder

- Consists of flip-flops and exclusive OR gates
- Has negligible impact on workload

<convolutional encoder of W-CDMA system>
<figure: the input feeds a shift register of 8 delay elements (D); Output 0 uses G0 = 561 (octal) and Output 1 uses G1 = 753 (octal)>
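A shift-register encoder with the two generators from this slide can be sketched in a few lines of Python. This is a minimal sketch under one stated assumption: the most significant bit of each octal generator taps the newest input bit (the slide does not give the bit order). The function name `conv_encode` is hypothetical.

```python
G0, G1 = 0o561, 0o753   # generator polynomials from the slide
K = 9                   # constraint length: current input + 8 delay elements

def conv_encode(bits):
    """Rate-1/2 convolutional encoder; returns one (out0, out1) pair per input bit."""
    register = 0  # bit K-1 holds the newest input (assumed tap ordering)
    out = []
    for b in bits:
        register = (register >> 1) | (b << (K - 1))
        out0 = bin(register & G0).count("1") & 1   # XOR of the tapped bits
        out1 = bin(register & G1).count("1") & 1
        out.append((out0, out1))
    return out

# Feeding a single 1 followed by zeros reads out the generator taps.
impulse_response = conv_encode([1] + [0] * 8)
```

Each output bit is only a handful of AND/XOR operations, which is consistent with the slide's remark that the encoder has negligible impact on workload.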

Channel Decoder

- Determine the most probable code sequence from the received sequence
  - Select the code C that has the minimum distance to the received sequence r
- One of the dominant workloads

<figure: received signal r and its distances d1, d2, ..., dN to the codes C1, C2, ..., CN>
- {Ci} : code set
- r : received signal

Channel Decoder – Viterbi Algorithm

- Most popular decoding algorithm for convolutional codes
- Consists of three steps:
  - Branch metric calculation (BMC): abs(a-b), parallelizable
  - Add compare select (ACS): min(a+b, c+d), parallelizable
  - Trace back (TB): recursive pointer tracing, sequential
- Amount of operation in W-CDMA
  - 16 Kbps voice: ~2 Gops

Channel Decoder – Turbo decoder

- Two algorithms are widely used
  - SOVA (Soft Output Viterbi Algorithm)
    - Less computation intensive
    - Lower error correction performance
  - Max-LogMap algorithm
    - More computation required
    - Higher error correction performance
- Amount of operation in W-CDMA
  - For 128 Kbps streaming data: ~18 Gops

Turbo Decoder

- Based on multiple iterations of the SOVA / Max-LogMap blocks
- More iterations give better performance

<High level block diagram of turbo decoder>
<figure: input -> demux -> SOVA/Max-LogMap -> interleaver -> SOVA/Max-LogMap -> deinterleaver -> output; one pass through both blocks is one iteration>

Block Interleaver/Deinterleaver

- Overcome the severe signal attenuation within a short time interval that frequently appears on wireless channels
- Interleaver (at the transmitter): randomize the sequence of source data
- Deinterleaver (at the receiver): recover the original sequence by reordering
- Amount of operation: < 10 Mops

<example of signal strength variation>
123456789 -> Interleaving -> 147258369 -> Deinterleaving -> 123456789
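The 123456789 -> 147258369 example corresponds to writing the data row by row into a 3 x 3 block and reading it column by column. The sketch below shows that plain row/column reordering only; the function names are hypothetical, and the extra column permutation pattern that a real W-CDMA interleaver applies is not modeled.

```python
def interleave(seq, n_cols):
    """Write row by row into a block with n_cols columns, read column by column."""
    n_rows = len(seq) // n_cols
    return [seq[r * n_cols + c] for c in range(n_cols) for r in range(n_rows)]

def deinterleave(seq, n_cols):
    """Inverse reordering: the same operation on the transposed block."""
    n_rows = len(seq) // n_cols
    return interleave(seq, n_rows)

tx = "".join(interleave("123456789", 3))   # "147258369"
rx = "".join(deinterleave(tx, 3))          # "123456789"
```

After interleaving, symbols that were adjacent are three positions apart, so a short burst of errors on the channel is spread across the block once the receiver restores the order.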

Spreader/Despreader

- Allow the transmission of several signals at the same time (x[n] and y[n] in the diagram below)
- Based on the orthogonality between spreading codes

<figure: the spreader multiplies x[n] by c_i[n] and y[n] by c_j[n]; the despreader multiplies the received signal by c_i[n] and c_j[n] to recover x[n] and y[n]>

$\frac{1}{N} \sum_{n=0}^{N-1} c_i[n]\, c_j[n] = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{otherwise} \end{cases}$

<orthogonality between codes>
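The orthogonality relation can be seen at work in a small example. The sketch below uses length-4 Walsh codes as the spreading codes; the code table, function names, and symbol values are assumptions for illustration and are not taken from the W-CDMA specification.

```python
# Length-4 Walsh codes: (1/N) * sum_n c_i[n] * c_j[n] is 1 if i == j, else 0.
WALSH = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]

def spread(symbols, code):
    """Replace each symbol by the symbol times every chip of the code."""
    return [s * c for s in symbols for c in code]

def despread(chips, code):
    """Correlate each block of N chips with the code and normalize by N."""
    n = len(code)
    return [sum(chips[k + i] * code[i] for i in range(n)) / n
            for k in range(0, len(chips), n)]

x = [1, -1, 1]
y = [-1, -1, 1]
# Both signals share the channel at the same time.
channel = [a + b for a, b in zip(spread(x, WALSH[1]), spread(y, WALSH[2]))]
x_hat = despread(channel, WALSH[1])
y_hat = despread(channel, WALSH[2])
```

Despreading with c_i removes the y[n] term entirely because the cross-correlation of the two codes is zero.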

Spreader/Despreader

- The spreader / despreader also suppresses noise
- Amount of operation: ~4 Gops

<figure: the spreader multiplies x[n] by Ci[n], giving x_sp[n] with the wide spectrum X_sp(f); noise N[n] is added in the wireless channel, giving r[n]; the despreader multiplies r[n] by Ci[n], giving r_desp[n], in which the noise signal is spread; the output is y[n]>

Scrambler/Descrambler

- Randomize the output signal by multiplying it by a pseudo-random sequence, the so-called scrambling code
- Allow multiple terminals to communicate at the same time
- Amount of operation: ~ 3 Gops

<figure: terminal 1 (scrambling code n) multiplies x[n] by c_sc,i[n] and terminal 2 (scrambling code m) multiplies y[n] by c_sc,j[n] in the scrambler; the descrambler multiplies by the conjugates c*_sc,i[n] and c*_sc,j[n]; each operation is a complex multiplication>
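The scramble / descramble pair is a complex multiplication by a code and then by its conjugate. The sketch below is illustrative only: the pseudo-random code comes from Python's `random` module as a stand-in (W-CDMA defines its own scrambling code generators, not reproduced here), and all function names are hypothetical.

```python
import random

def make_scrambling_code(length, seed):
    """Stand-in pseudo-random complex code with values in {+-1 +-1j}."""
    rng = random.Random(seed)
    return [complex(rng.choice((-1, 1)), rng.choice((-1, 1)))
            for _ in range(length)]

def scramble(samples, code):
    """Complex multiplication by the scrambling code."""
    return [s * c for s, c in zip(samples, code)]

def descramble(samples, code):
    """Multiply by the conjugate code; c * conj(c) = 2 for these code values."""
    return [s * c.conjugate() / 2 for s, c in zip(samples, code)]

data = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
code = make_scrambling_code(len(data), seed=1)
recovered = descramble(scramble(data, code), code)
```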

Low Pass Filter

- Suppress the signal terms outside the pass band

<figure: filtering of the input signal into the output signal. Time domain: impulse signal -> sinc function. Frequency domain: band-unlimited signal -> band-limited signal>

Low Pass Filter

- Use a conventional FIR filter:
  $y[n] = \sum_{i=0}^{N-1} h_i\, x[n-i]$
- Number of filter taps (N) = 32 ~ 64
- Amount of operation: ~ 12 Gops

<figure: tapped delay line; x[n] passes through z^-1 delay elements giving x[n-1], x[n-2], ..., x[n-N+1]; the taps are weighted by h0, h1, h2, ..., hN-1 and summed into y[n]>
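The FIR sum translates directly into a nested loop. The following is a minimal sketch, assuming samples before the start of the input are zero; the function name and the example coefficients are hypothetical and are not the W-CDMA pulse-shaping taps.

```python
def fir_filter(x, h):
    """y[n] = sum_{i=0}^{N-1} h[i] * x[n-i], with x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for i, h_i in enumerate(h):
            if n - i >= 0:
                acc += h_i * x[n - i]   # one multiply-accumulate per tap
        y.append(acc)
    return y

# An impulse input reproduces the filter coefficients.
impulse_response = fir_filter([1, 0, 0, 0], [1, 2, 3])
```

Every output sample costs N multiply-accumulates, so with N = 32 ~ 64 taps the filter workload scales with the tap count times the sample rate.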

Rake Receiver – Multipath fading

- The rake receiver mitigates the multipath fading effect
- Multipath fading is a major cause of unreliable wireless channel characteristics

<figure: the transmitted signal x(t) reaches the receiver over one, two, or three paths>
y(t) = a0x(t)
y(t) = a0x(t)+a1x(t-d1)
y(t) = a0x(t)+a1x(t-d1)+a2x(t-d2)

Rake Receiver - Functions

- Ideally, the function of the rake receiver is to aggregate the signal terms with proper delay compensation
- We need to know the delay spread of the received signal, which varies randomly

Input of rake receiver:
y(t) = a0x(t)+a1x(t-d1)+a2x(t-d2)

Output of rake receiver:
r(t) = a0x(t-tdelay)+a1x(t-d1-dest1)+a2x(t-d2-dest2)
     = (a0+a1+a2) * x(t-tdelay)

Rake Receiver – Detect Delay Spread

- Scan the received signal in the frame buffer while computing its correlation with the scrambling code sequence

<figure: a correlation window slides over the received signal; the correlation result shows peaks a0, a1, a2 at delays 0, d1, d2>

$y[n] = a_0 x[n] + a_1 x[n-d_1] + a_2 x[n-d_2]$
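A toy version of the delay search can be sketched as follows. Everything here is an assumption for illustration: a length-31 m-sequence stands in for the scrambling code, the delays are circular shifts, the channel is noise-free, and the function names are hypothetical.

```python
def m_sequence(length=31):
    """+-1 m-sequence from the recurrence s[n+5] = s[n+2] XOR s[n]."""
    reg = [1, 0, 0, 0, 0]
    bits = []
    for _ in range(length):
        bits.append(reg[0])
        reg = reg[1:] + [reg[0] ^ reg[2]]
    return [1 - 2 * b for b in bits]

def multipath(code, gains, delays):
    """y[n] = sum_p a_p * x[n - d_p], using circular shifts for simplicity."""
    n = len(code)
    return [sum(a * code[(k - d) % n] for a, d in zip(gains, delays))
            for k in range(n)]

def search_delays(received, code, n_paths):
    """Correlate at every offset and keep the n_paths strongest peaks."""
    n = len(code)
    corr = [sum(received[k] * code[(k - off) % n] for k in range(n))
            for off in range(n)]
    best = sorted(range(n), key=lambda off: corr[off], reverse=True)[:n_paths]
    return sorted(best)

code = m_sequence()
y = multipath(code, gains=[1.0, 0.6, 0.3], delays=[0, 5, 12])
found = search_delays(y, code, n_paths=3)
```

The correlation peaks stand out because the code has a sharp autocorrelation; the full correlation over every offset is what makes the searcher a heavy workload.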

Computation of Rake Receiver

- Correlation computation: L_W x L_B x F
  - L_W: correlation window = 320
  - L_B: frame buffer size = 5120
  - F: operation frequency = 50
  - L_W x L_B x F = ~ 80 mega multiplications / sec
  - Multiplications can be converted into subtractions
- Amount of operation in W-CDMA: ~25 Gops
  - Most dominant workload

Rake Receiver – Overall Architecture

- Searcher: detects the delay spread (d1, d2, d3 and a1, a2, a3) of the received signal r(t)
- Delay and descrambler/despreader branches: compensate the propagation delay
- Combiner: recombines the signal terms without delay

<figure: r(t) feeds the searcher and three delay -> descrambler/despreader branches, whose outputs enter the combiner>

Power Control

- The receiver controls the transmission power of the transmitter in order to minimize the interference to other users
- Required computation is negligible

<figure: the basestation sends a pilot signal to the terminal; the terminal compares its strength with the reference level and returns power control commands, e.g. u d u u d d u>
- Strength of the pilot signal is below the reference level: terminal sends UP command (u)
- Strength of the pilot signal is above the reference level: terminal sends DOWN command (d)
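The command rule above amounts to one comparison per pilot measurement, which is why the computation is negligible. The sketch below is illustrative: the function name and the measurement values are hypothetical, and treating a measurement exactly at the reference level as DOWN is an assumption that the slide does not specify.

```python
def power_control_commands(pilot_strengths, reference_level):
    """Return 'u' (UP) when the pilot is below the reference level, else 'd' (DOWN)."""
    # Assumption: a value exactly at the reference level yields DOWN.
    return ["u" if p < reference_level else "d" for p in pilot_strengths]

# Example pilot measurements (made-up values) against a reference level of 1.0.
commands = power_control_commands([0.8, 1.2, 0.9, 0.7, 1.1, 1.3, 0.95], 1.0)
```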

H/W operation states

- Radio resource control states are defined in the W-CDMA specification
- Operation states are defined according to H/W activity

Idle
• For long idle periods between sessions
• Periodic wake-up for control message reception
• Minimum workload, but dominates terminal standby time

Control Hold
• For short idle periods between packet bursts
• Holds a narrow control channel for fast transition to Active
• Intermediate workload

Active
• For packet burst transmission periods
• Uses high speed packet channels up to 2 Mbps
• Most heavily loaded state


Workload Characterization

Workload Profile

- One operation is equivalent to one RISC instruction
- Searcher, turbo decoder, and LPF are the dominant workloads
- The workload profile varies according to the operation state

<figure: bar chart of workload in GOPS (axis 0 to 30) per algorithm: Searcher, Interleaver, Deinterleaver, Viterbi Encoder, Viterbi Decoder, Turbo Encoder, Turbo Decoder, Scrambler, Descrambler, Scrambling-code (Tx), Scrambling-code (Rx), Spreader, Despreader, Combiner, LPF (Tx), LPF (Rx), Power Control; one series each for the idle, control hold, and active states>

Processing Time Requirement

- Mixture of algorithms with various processing time requirements
- Classified into two categories
  - Heavy workload with long processing time (turbo decoder, searcher)
  - Light workload with short processing time (scrambler, spreader, LPF, power control)

Parallelism

- Most heavy-workload algorithms have significant vector parallelism
- The data width of most operations is 8 bits

Memory Access Pattern

- Huge memory is not required
- Traffic between algorithms is not dominant
- The access rate of scratch pad memory is very high

Instruction Breakdown

- ADD/SUB are the dominant instructions
- Multiplication is not dominant in heavy workloads

<figure: stacked bar chart of instruction type percentage (axis 0 to 1.2) per algorithm: Searcher, Interleaver, Deinterleaver, Viterbi Encoder, Viterbi Decoder, Turbo Encoder, Turbo Decoder, Scrambler, Descrambler, Scrambling code (Tx), Scrambling code (Rx), Spreader, Despreader, Combiner, LPF (Rx), LPF (Tx), Power control, Average; instruction types: ADD/SUB, MUL/DIV, LOGIC, LD, ST, BRANCH (IB), BRANCH (IF), Branch (Others), Others>

Frequent Computations

- Most multiplications are simplified into cheaper operations
- Multiplication in LPF-Rx cannot be simplified because both operands are 16-bit integers


Architectural Implications

- SIMD, because
  - We can exploit vector parallelism in W-CDMA algorithms
  - High power efficiency can be achieved by sharing control logic between datapath elements
- Chip multiprocessor, because
  - There is substantial algorithm-level parallelism
  - There are many tiny sequential algorithms
- Multiple SIMD + Scalar

<figure: several SIMD processors and one scalar processor connected by an interconnection network>

Architectural Implications

- Memory structure
  - Cache free: the memory access pattern exhibits very dense spatial locality
  - Small data memory (<64K)
  - Small instruction memory (<4K)
- Simple interconnection network
  - Low inter-processor communication is possible by algorithm-level task mapping on each PE

Architectural Implications

- Power management
  - Large workload variation according to operation state and radio channel condition changes
  - Various power management schemes can be applied: DVS, DFS, clock gating
  - Idle mode power must be minimized because it dominates terminal standby time

W-CDMA benchmark suite

- C-based implementation of the W-CDMA physical layer operations
- Used for the workload characterization done in this paper
- Available at www.eecs.umich.edu/~sdrg

Conclusion

- We discussed:
  - What SDR is and why it is a challenging topic for embedded systems
  - The evolution of wireless protocols and the core technologies of emerging protocols
- We analyzed:
  - The workload characteristics of the W-CDMA protocol and their architectural implications


Backup Slides

Viterbi Algorithms – Trellis Diagram

- The Viterbi algorithm is based on the trellis diagram
- The trellis diagram represents all possible state transitions of the encoder

<Example of trellis diagram and corresponding convolutional encoder>
<figure: four-state trellis (states 00, 01, 10, 11) for an encoder with input x[n] and output y[n]; one line style marks a state transition by input 0 and the corresponding output, the other marks a state transition by input 1 and the corresponding output>

Viterbi Algorithm - BMC

- The BMC (branch metric calculation) operation computes the difference between the received sequence r and the outputs of the trellis diagram

  BMCi,j = distance(rij, oij) = abs(rij - oij)

  - oij : output of the state transition from i to j
  - rij : corresponding received sequence
- All BMC operations in a trellis diagram can be done in parallel
- Example: distance between r(01) and Cn(10) = 1 + 1 = 2

<figure: four-state trellis (00, 01, 10, 11) for r = 0 1 ..., with each branch labeled by its output and BMC value, and a candidate code sequence Cn>

Viterbi Algorithm - ACS

- The ACS (add compare select) operation is:

  ACSk = min(ACSi + BMCi,k, ACSj + BMCj,k)

- This procedure is equivalent to finding a locally optimal code sequence
  - If C1 has the smallest ACS value at node state i, then the ACS values of C2 and C3 are always greater than that of C1

<figure: states i and j, with metrics ACSi and ACSj, connect to state k through branches BMCi,k and BMCj,k (add, then compare and select); candidate sequences C1, C2, C3 meet at state i>

Viterbi Algorithm - TB

- Trace back the code sequence that is closest to the received sequence
- Sequential algorithm

<figure: four-state trellis (00, 01, 10, 11) annotated with ACS values for r = 0 1 1 0 0; decoded result = 0 1 0 0 0>

1) Find the node that has the smallest ACS value (00 in this example)
2) Trace back from node 00
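The three steps (BMC, ACS, TB) fit together as in the sketch below. It is a hard-decision decoder for a four-state, rate-1/2 code with generators 7 and 5 (octal); that particular code is an assumption chosen for illustration and is not necessarily the encoder drawn in these slides. All function names are hypothetical.

```python
GENERATORS = (0b111, 0b101)   # assumed example code, constraint length 3
K = 3
N_STATES = 1 << (K - 1)

def branch_output(state, bit):
    """Encoder output pair when `bit` enters the encoder in `state`."""
    reg = (bit << (K - 1)) | state
    return tuple(bin(reg & g).count("1") & 1 for g in GENERATORS)

def next_state(state, bit):
    return ((bit << (K - 1)) | state) >> 1

def encode(bits):
    state, out = 0, []
    for b in bits:
        out.append(branch_output(state, b))
        state = next_state(state, b)
    return out

def viterbi_decode(received):
    inf = float("inf")
    metric = [0] + [inf] * (N_STATES - 1)   # the encoder starts in state 0
    history = []
    for r in received:
        new_metric = [inf] * N_STATES
        choice = [None] * N_STATES
        for s in range(N_STATES):
            if metric[s] == inf:
                continue
            for b in (0, 1):
                o = branch_output(s, b)
                bmc = abs(r[0] - o[0]) + abs(r[1] - o[1])   # BMC
                cand = metric[s] + bmc                       # add
                ns = next_state(s, b)
                if cand < new_metric[ns]:                    # compare, select
                    new_metric[ns] = cand
                    choice[ns] = (s, b)
        metric = new_metric
        history.append(choice)
    # TB: start from the state with the smallest metric and follow the pointers.
    state = min(range(N_STATES), key=lambda s: metric[s])
    bits = []
    for choice in reversed(history):
        state, b = choice[state]
        bits.append(b)
    return bits[::-1]

message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
coded = encode(message)
coded[2] = (coded[2][0] ^ 1, coded[2][1])   # inject a single bit error
decoded = viterbi_decode(coded)
```

The inner loops over states are independent of each other, which matches the slides' note that BMC and ACS are parallelizable, while the trace back loop follows one pointer at a time and is inherently sequential.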

Block Interleaver/Deinterleaver

- Interleaver
  - Write row by row sequentially
  - Read column by column according to the predefined permutation pattern
- Deinterleaver
  - Write column by column according to the predefined permutation pattern
  - Read row by row sequentially

<interleaving procedure>
Input sequence: b0 b1 b2 ... b(L-1)

Write (row by row):
b0 b1 b2 ... b(M-1)
bM b(M+1) b(M+2) ... b(2M-1)
...
b((L/M-1)*M) b(1+(L/M-1)*M) b(2+(L/M-1)*M) ... b(L-1)

Read (column by column): b0 bM ... b1 b(M+1) ... b(M-1) b(2M-1) ... b(L-1)

