Design and Implementation of Turbo Decoder for 4G standards IEEE 802.16e and LTE Syed Z. Gilani.


Motivation

• Conventional serial decoding architectures can be a performance bottleneck
– 6144-bit block, 8 iterations @ 250 MHz, 1 bit processed per cycle => data rate < 6144 / (6144*8*4 ns)
– ~31 Mbps
• Data rates for LTE can be 100 Mbps-300 Mbps
• A parallel architecture is necessary to support high-throughput decoding
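The serial bound above can be reproduced with a few lines of arithmetic. This is a minimal sketch; the function name `serial_throughput_bps` and its parameters are illustrative and not taken from the slides.

```python
def serial_throughput_bps(clock_hz, iterations, bits_per_cycle=1):
    """Upper bound on decoded bits per second for a serial decoder.

    Every bit of the block is visited once per iteration, so the block
    length cancels out: N / (N * iterations * cycle_time).
    """
    return clock_hz * bits_per_cycle / iterations


# Numbers from the slide: 250 MHz clock, 8 iterations, 1 bit per cycle.
rate = serial_throughput_bps(250e6, 8)
print(f"{rate / 1e6:.2f} Mbps")  # 31.25 Mbps
```

Because the block length cancels, the bound depends only on clock rate and iteration count, which is why a faster clock alone cannot reach the 100 Mbps-300 Mbps range.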

• Maximum a posteriori (MAP) algorithm
– Alpha
– Beta
– Gamma
– LLR
• (De)Interleaver
– LTE: P(i) = (f1*i + f2*i^2) mod N
– WiMAX: switch (i mod 4)
case 0: P(i) = (P0*i + 1) mod N
case 1: P(i) = (P0*i + 1 + N/2 + P1) mod N
case 2: P(i) = (P0*i + 1 + P2) mod N
case 3: P(i) = (P0*i + 1 + N/2 + P3) mod N
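As a reference point, the two interleaving functions can be written directly from their definitions. This is only a sketch: the function names are made up for illustration, and the parameter sets used below (N=40, f1=3, f2=10 for LTE; N=24, P0=5, P1=P2=P3=0 for WiMAX) are assumed example values that should be checked against the standards' tables before being relied on.

```python
def qpp_interleave(i, n, f1, f2):
    """LTE-style quadratic interleaver: P(i) = (f1*i + f2*i^2) mod N."""
    return (f1 * i + f2 * i * i) % n


def wimax_interleave(i, n, p0, p1, p2, p3):
    """WiMAX-style interleaver: the added offset is selected by i mod 4."""
    offsets = (0, n // 2 + p1, p2, n // 2 + p3)
    return (p0 * i + 1 + offsets[i % 4]) % n


# Assumed example parameters (illustrative, not taken from the slides).
lte = [qpp_interleave(i, 40, 3, 10) for i in range(40)]
wimax = [wimax_interleave(i, 24, 5, 0, 0, 0) for i in range(24)]

# A valid interleaver visits every address exactly once.
print(sorted(lte) == list(range(40)), sorted(wimax) == list(range(24)))
```

This direct form needs a multiplier and a modulo per address, which is the cost the optimizations later in the deck aim to remove.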

Turbo Decoder Overview

Optimizations

• Resource sharing
• Retiming
• Look-ahead transformation
• Variable and adaptive parallelism
• Multiplierless interleaver

Parallelization

[Figure: trellis, States versus Time (cycles), with segments labeled PE 1, PE 2, PE 3 and PE 4]

Variable Parallelization

[Figure: Parallel Interleaver with two memory banks (Bank 0, Bank 1); labels: Coded Bits, Decoded Bits]

Variable Parallelization

[Figure: Parallel Interleaver with four memory banks (Bank 0, Bank 1, Bank 2, Bank 3); labels: Coded Bits, Decoded Bits]

Interleaver Optimization

• Interleaving functions
– LTE: P(i) = (f1*i + f2*i^2) mod N
– WiMAX: switch (i mod 4)
case 0: P(i) = (P0*i + 1) mod N
case 1: P(i) = (P0*i + 1 + N/2 + P1) mod N
case 2: P(i) = (P0*i + 1 + P2) mod N
case 3: P(i) = (P0*i + 1 + N/2 + P3) mod N
• Unoptimized memory requirements
– Don't want to use multipliers and dividers
– Storing all memory addresses in RAM
– LTE alone supports 40 different block lengths with different interleaving parameters
– Block lengths vary from 40 bits to 6144 bits

Interleaver Optimization

• On-the-fly address generation
• LTE interleaving function
P(i) = (f1*i + f2*i^2) mod N
P(i+1) = (f1*(i+1) + f2*(i+1)^2) mod N = (P(i) + f1 + f2 + 2*f2*i) mod N
• WiMAX interleaving function
switch (i mod 4)
case 0: P(i) = (P0*i + 1) mod N
case 1: P(i) = (P0*i + 1 + N/2 + P1) mod N
case 2: P(i) = (P0*i + 1 + P2) mod N
case 3: P(i) = (P0*i + 1 + N/2 + P3) mod N
– P(i+1) = (P0*i + P0 + constant factor) mod N, where the constant factor depends on (i+1) mod 4
• Replace the sum by its residue (subtract N) whenever the sum reaches or exceeds N, which avoids a general mod N operation
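The recursions above can be sketched as address generators that use only additions, a comparison and a subtraction per step. This is a minimal illustration, not the author's hardware design: the names `add_mod`, `qpp_addresses` and `wimax_addresses` are hypothetical, the example parameters are assumed values, and the `%` operations on constants stand in for values that would be precomputed once per block size.

```python
def add_mod(a, b, n):
    """(a + b) mod n for 0 <= a, b < n: one add, one compare, one subtract."""
    s = a + b
    return s - n if s >= n else s


def qpp_addresses(n, f1, f2):
    """Yield P(0..n-1) for P(i) = (f1*i + f2*i^2) mod n without multiplying by i."""
    p = 0                   # P(0) = 0
    g = (f1 + f2) % n       # g(i) = (f1 + f2 + 2*f2*i) mod n, here for i = 0
    step = (2 * f2) % n     # g(i+1) = g(i) + 2*f2
    for _ in range(n):
        yield p
        p = add_mod(p, g, n)     # P(i+1) = P(i) + g(i)
        g = add_mod(g, step, n)


def wimax_addresses(n, p0, p1, p2, p3):
    """Yield P(0..n-1) for the four-case function, tracking (p0*i + 1) mod n."""
    offsets = [0, (n // 2 + p1) % n, p2 % n, (n // 2 + p3) % n]
    base = 1 % n            # (p0*0 + 1) mod n
    inc = p0 % n
    for i in range(n):
        # i % 4 is just the two low bits of the index counter.
        yield add_mod(base, offsets[i % 4], n)
        base = add_mod(base, inc, n)


# Assumed example parameters, checked against the direct formulas.
lte_ok = list(qpp_addresses(40, 3, 10)) == [(3 * i + 10 * i * i) % 40 for i in range(40)]
print(lte_ok)
```

Keeping every operand below N is what allows the modulo to collapse into a single conditional subtraction.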

Interleaver Optimization

PE   i      P(i)   Bank Add.   Bit Add.
0    0      1      0           1
1    300    1501   5           1
2    600    601    2           1
3    900    2101   7           1
4    1200   1201   4           1
5    1500   301    1           1
6    1800   1801   6           1
7    2100   901    3           1

PE   i      P(i)   Bank Add.   Bit Add.
0    1      1320   4           120
1    301    420    1           120
2    601    1920   6           120
3    901    1020   3           120
4    1201   120    0           120
5    1501   1620   5           120
6    1801   720    2           120
7    2101   2220   7           120
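The bank and bit addresses in the tables are consistent with splitting each interleaved address by a window of 300 entries (the spacing between the PEs' indices). The sketch below makes that assumption explicit. The window size is inferred from the table rather than stated on the slide, and the helper names are illustrative.

```python
WINDOW = 300  # assumed: entries per bank, inferred from the i column spacing


def bank_and_offset(p, window=WINDOW):
    """Split an interleaved address into (bank address, bit address)."""
    return divmod(p, window)


def is_contention_free(addresses, window=WINDOW):
    """True if every address in one cycle falls into a different bank."""
    banks = [p // window for p in addresses]
    return len(set(banks)) == len(banks)


# P(i) columns copied from the two tables (PE 0 to PE 7).
cycle_a = [1, 1501, 601, 2101, 1201, 301, 1801, 901]
cycle_b = [1320, 420, 1920, 1020, 120, 1620, 720, 2220]

for cycle in (cycle_a, cycle_b):
    print([bank_and_offset(p) for p in cycle], is_contention_free(cycle))
```

In both cycles the eight PEs hit eight distinct banks at the same bit address, so a single address generator plus a bank permutation is enough to serve all PEs.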

Lookahead Transformation

[Figure: trellis diagrams with states 0 to 7; time labels tk, tk+1 and tk, tk+2]

• 16 comparisons required for lookahead transformation in duo-binary WiMAX turbo codes
• Increases throughput by 2x
• Maximum clock rate decreases from 500 MHz to ~300 MHz, along with a significant increase in area

Results

No. of Iterations   Number of PEs   Throughput   Serial throughput
2                   2               490 Mbps     243 Mbps
2                   4               909 Mbps     243 Mbps
2                   8               1666 Mbps    243 Mbps
4                   2               245 Mbps     122 Mbps
4                   4               455 Mbps     122 Mbps
4                   8               833 Mbps     122 Mbps
8                   2               122 Mbps     60 Mbps
8                   4               228 Mbps     60 Mbps
8                   8               417 Mbps     60 Mbps

@ 500 MHz
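To read the table in terms of scaling, the ratio of parallel to serial throughput can be computed per row. This sketch only derives speedup and per-PE efficiency from the figures reported above; the helper names are illustrative and nothing here is an additional measurement.

```python
# (iterations, PEs, throughput in Mbps, serial throughput in Mbps), from the table.
results = [
    (2, 2, 490, 243), (2, 4, 909, 243), (2, 8, 1666, 243),
    (4, 2, 245, 122), (4, 4, 455, 122), (4, 8, 833, 122),
    (8, 2, 122, 60), (8, 4, 228, 60), (8, 8, 417, 60),
]


def speedup(throughput, serial):
    """Parallel throughput relative to the serial decoder."""
    return throughput / serial


def efficiency(throughput, serial, pes):
    """Speedup divided by the number of processing elements."""
    return speedup(throughput, serial) / pes


for iters, pes, tp, ser in results:
    print(f"{iters} iterations, {pes} PEs: "
          f"speedup {speedup(tp, ser):.2f}, efficiency {efficiency(tp, ser, pes):.2f}")
```

The speedup stays close to the PE count at 2 PEs and falls below it at 4 and 8 PEs. Ratios slightly above the PE count come from rounding in the reported figures.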

Questions

Outline

• Motivation
• Turbo Encoding
• Turbo Decoding
• Optimizations
– Look-ahead transformation
– Variable and adaptive parallelism
– Multiplierless interleaver
• Results
• Summary

Turbo Encoder

[Figures: LTE Turbo Encoding; WiMAX Turbo Encoding]

Parallelization

• Example: 4-state trellis
• 1 decoded symbol per cycle

[Figure: trellis, States versus Time (cycles)]

Date post:	29-Jan-2016
