Date post: | 14-Apr-2018 |
Upload: | senthil-kumar |
7/30/2019 23Speech v104 Vg
1/70
DSP C5000
Chapter 23
Mobile Communication: Speech Coders
Copyright 2003 Texas Instruments. All rights reserved.
Outline
Speech Coding, CELP Coders
Implementation using C54x
Outline: Speech Coding
Generalities on speech and coding
Linear Prediction based coders
Short term and long term prediction
Vector Quantization
CELP coders
Structure and calculations
Standards
Applications of Speech Coding
Digital Transmissions
On wired telephone:
Multiplexing
Integration of services
On wireless channels:
Spectral efficiency
For better protection against errors
Storage: voice mail/messaging, telephone answering machine
Secure phone
Characteristics of Coders
Bit Rate D: 50 bps < D < 96 kbps
Coding Delay ~ frame delay
Quality
Objective measurements: SNR, PSQM
Subjective measurements: MOS (excellent, good, fair, poor, unacceptable)
Intelligibility:
Objective measure STI or subjective DRT
Acceptability: E model of ETSI standard, communicability
Immunity to noise
Complexity
Objective Evaluation of the Quality
The PSQM method:
Objective evaluation
Based on a model of auditory perception
Takes into account the masking effects
Good correlation with the MOS grade in basic conditions:
Low bit rate speech coding, tandem, transmission errors, ...
But sometimes not very reliable:
Loss of frames, effect of the automatic control
Still under development (PSQM+)
Subjective Evaluation of Quality using the ACR Method, yielding the MOS score
A large number of listeners grade a large number of speech sequences.
Database with phonetically balanced sentences
Presentation in random order
Naive listeners
Statistical processing of the results gives the MOS.
MOS = Mean Opinion Score
Grading scale: 1 = Bad, 2 = Mediocre, 3 = Average, 4 = Good, 5 = Excellent
Speech Production
[Figure: the speech production organs. Labels: diaphragm, lungs, wind pipe, larynx, vocal cords, vocal tract, nasal cavity, mouth, palate, tongue, lips, jaws. Functional blocks: wind tunnel, oscillator, articulators.]
Speech Signal
Speech Spectrum for a Voiced Sound
Speech Spectrogram
Non stationary
Voiced / unvoiced
Calculation of Spectrograms
Preac = pre-emphasis, enhances high frequencies
Window = limits the edge effects
Processing chain: speech signal -> Preac -> Window -> FFT -> Log(| |) -> power spectral density
[Figure: example of window, amplitude versus time]
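The processing chain above can be sketched as follows. This is a minimal illustration, not the course's implementation: the pre-emphasis coefficient (0.95), the Hamming window, the frame length and the hop size are all assumed values, and a direct DFT is used instead of an FFT to keep the sketch dependency-free.

```python
import cmath
import math


def pre_emphasis(x, mu=0.95):
    """y[n] = x[n] - mu * x[n-1]; mu = 0.95 is an assumed, typical value."""
    return [x[0]] + [x[n] - mu * x[n - 1] for n in range(1, len(x))]


def hamming(size):
    """Hamming window, one common choice to limit edge effects (assumption)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (size - 1)) for n in range(size)]


def log_spectrum(frame):
    """Log10 magnitude of the DFT of a windowed frame, bins 0..N/2."""
    size = len(frame)
    windowed = [s * w for s, w in zip(frame, hamming(size))]
    spectrum = []
    for k in range(size // 2 + 1):
        acc = sum(windowed[n] * cmath.exp(-2j * math.pi * k * n / size)
                  for n in range(size))
        spectrum.append(math.log10(abs(acc) + 1e-12))  # small offset avoids log(0)
    return spectrum


def spectrogram(x, frame_len=64, hop=32):
    """One log spectrum per frame; frame_len and hop are illustrative values."""
    y = pre_emphasis(x)
    return [log_spectrum(y[s:s + frame_len])
            for s in range(0, len(y) - frame_len + 1, hop)]
```

Each row of the result is one column of the spectrogram; with a sampling frequency of 8000 Hz and 64-sample frames, bin k corresponds to k x 125 Hz.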
Example: Time Signal and Spectrogram
[Figure: time signal (amplitude versus time) and its spectrogram (frequency from 0 to 3500 Hz versus time)]
Equivalent Electrical Model
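[Figure: equivalent electrical model. A voicing switch (V / NV) selects a periodical excitation (voiced) or a stochastic excitation (non-voiced); the excitation is scaled by a gain and passed through the spectrum shaping filter (transfer function of the vocal tract) to produce speech.]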
Simplified Speech Production Model
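$y(t) = h(t) * e(t) \;\leftrightarrow\; Y(z) = H(z)E(z)$
Chain: $e(t)$ -> $G(z)$ (glottis) -> $V(z)$ (vocal tract) -> $L(z)$ (radiation at the lips) -> $y(t)$
$$G(z) = \frac{1}{\left(1 - e^{-cT} z^{-1}\right)^2}$$
$$V(z) = \frac{1}{\prod_{k=1}^{K} \left(1 - e^{-c_k T + j b_k T} z^{-1}\right)\left(1 - e^{-c_k T - j b_k T} z^{-1}\right)}$$
$$L(z) = 1 - z^{-1}$$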
All Pole Model of the Spectrum Shaping Filter
The filter H(z) represents the spectral envelope, since the excitation has a white spectrum.
$$H(z) = G(z)\,V(z)\,L(z) = \frac{1}{A(z)} = \frac{1}{1 - \sum_{i=1}^{p} a_i z^{-i}}$$
Short Term Linear Prediction
The coefficients of H(z) = 1/A(z) can be obtained by linear prediction.
Short term analysis on the speech signal x(n)
Frames of 10 to 30 ms.
Least square error criterion:
$$\hat{x}(n) = \sum_{i=1}^{p} a_i\,x(n-i) \quad \text{(linear prediction)}$$
$$e(n) = x(n) - \hat{x}(n), \qquad \text{criterion: } \min \sum_{n} e^2(n)$$
Determination of the Spectral Envelope by Linear Prediction
The prediction error e(n) (the residual) is nearly white, so the spectral envelope of x(n) can be approximated by $S_x(f)$:
$$S_x(f) = \frac{\sigma^2}{|A(f)|^2}$$
where $\sigma^2$ is the least square prediction error.
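Calculation of the Prediction Coefficients
The prediction coefficients $a_i$ are the solution of the normal equations:
$$\begin{pmatrix} r_{1,1} & \cdots & r_{1,p} \\ \vdots & & \vdots \\ r_{p,1} & \cdots & r_{p,p} \end{pmatrix} \begin{pmatrix} a_1 \\ \vdots \\ a_p \end{pmatrix} = \begin{pmatrix} r_{0,1} \\ \vdots \\ r_{0,p} \end{pmatrix}, \qquad r_{i,j} = \sum_{n} x(n-i)\,x(n-j)$$
The Levinson-Durbin algorithm is often used to solve these equations.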
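As an illustration, here is a minimal sketch of the Levinson-Durbin recursion. It assumes the autocorrelation method, where $r_{i,j}$ depends only on $|i-j|$ (written r[k] below), and the predictor convention $\hat{x}(n) = \sum_i a_i x(n-i)$. The function names are hypothetical.

```python
def autocorrelation(x, p):
    """r[k] = sum_n x[n] * x[n-k] for k = 0..p (autocorrelation method)."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(p + 1)]


def levinson_durbin(r, p):
    """Solve sum_j a_j r[|i-j|] = r[i] for i = 1..p.

    Returns (a, k, err): prediction coefficients a_1..a_p,
    reflection coefficients k_1..k_p, final prediction error energy.
    """
    a = [0.0] * (p + 1)          # a[0] is unused
    err = r[0]
    reflection = []
    for m in range(1, p + 1):
        acc = r[m] - sum(a[j] * r[m - j] for j in range(1, m))
        k = acc / err
        reflection.append(k)
        updated = a[:]
        updated[m] = k
        for j in range(1, m):
            updated[j] = a[j] - k * a[m - j]
        a = updated
        err *= (1.0 - k * k)     # error energy decreases at each order
    return a[1:], reflection, err
```

The recursion costs on the order of p^2 operations instead of p^3 for a general linear solver, and it yields the reflection coefficients $k_i$ as a by-product.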
Example of Linear Prediction
[Figure: amplitude of the speech signal and amplitude of the residual signal]
Example of Linear Prediction: Spectral Envelope Estimation
Formants
Estimation of the Pitch Period
Pitch period T0 estimated by correlation of the speech signal or of the residual.
Other methods exist (e.g. cepstrum).
F0 = fundamental frequency = 1/T0
Fractional pitch estimation if the precision is better than the sampling period.
$$50\ \text{Hz} \le F_0 \le 400\ \text{Hz}$$
$$T_0 = k\,T_S, \quad 20 \le k \le 160 \quad \text{if } F_S = \frac{1}{T_S} = 8000\ \text{Hz}$$
Long Term Prediction (LTP)
The idea is to predict one period of signal from the preceding one:
$$\hat{x}(n) = b\,x(n-M)$$
2 unknowns: b and M.
M is the pitch period (when voiced).
Least square error criterion is used.
Long Term Prediction (LTP)
For a given value of M, the optimal b is:
$$b = \frac{\sum_n x_n\,x_{n-M}}{\sum_n x_{n-M}^2}$$
The best M value maximizes:
$$\frac{\left(\sum_n x_n\,x_{n-M}\right)^2}{\sum_n x_{n-M}^2}$$
All possible values of M must be tested.
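The exhaustive search over M can be sketched as below. This is an open-loop illustration under assumptions: the function name and arguments are hypothetical, and the caller must make sure that `start >= m_max` so that all delayed samples exist.

```python
def ltp_search(x, start, length, m_min, m_max):
    """Return (M, b) for the frame x[start:start+length].

    M maximizes (sum x[n] x[n-M])^2 / sum x[n-M]^2 and
    b = sum x[n] x[n-M] / sum x[n-M]^2 is the optimal gain for that M.
    """
    best_m, best_score, best_b = None, -1.0, 0.0
    for m in range(m_min, m_max + 1):      # all candidate lags are tested
        num = sum(x[n] * x[n - m] for n in range(start, start + length))
        den = sum(x[n - m] ** 2 for n in range(start, start + length))
        if den <= 0.0:
            continue                       # skip silent history
        score = num * num / den
        if score > best_score:
            best_m, best_score, best_b = m, score, num / den
    return best_m, best_b
```

For a signal that repeats every 50 samples with an amplitude decay of 0.8 per period, the search returns M = 50 and b close to 0.8.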
Example of Long Term Prediction
LPC 10 Vocoder
One of the oldest speech coders is the LPC10 vocoder:
The analysis (coder) calculates for each frame:
Pitch period, prediction coefficients, energy, voicing.
The synthesis (decoder) uses these parameters to synthesize speech from the equivalent electrical model.
LPC 10 Vocoder (Order 10)
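[Figure: LPC10 coder and decoder.
Coder: each speech frame goes through linear prediction (coefficients a_i), pitch/voicing analysis (V/UV, F0) and energy calculation (E); the parameters are multiplexed. Excitation parameters: 600 bps. Spectrum shaping filter parameters: 1800 bps.
Decoder: V/UV and F0 drive the excitation, which is scaled by the gain (from E) and filtered by 1/A(z), built from the a_i, to give the synthetic speech.
Frame = 22.5 ms]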
Prediction Spectral Parameters
The a_i coefficients are sensitive to coding and interpolation.
They are replaced by other coefficients:
Reflection coefficients k_i, log area ratios LAR_i.
Line spectrum frequencies LSF_i.
In the LPC10 vocoder:
The pitch and voicing are coded on 7 bits.
The log of the energy is coded on 5 bits.
The 10 prediction coefficients a_i (transformed into k_i and LAR_i) are coded on 41 bits.
A total of 53 bits, plus 1 synchronization bit, per frame of 22.5 ms = 2400 bps.
Vector Quantization (2-dimensional example)
[Figure: three plots of the pair (k1, k2) over the square [-1, 1] x [-1, 1]: scalar quantization (regular grid of codes), input vectors (training base), and vector quantization (code vectors placed where the training vectors are dense). The index of the code vector is transmitted.]
Bit rate can be decreased by applying VQ to the coefficients.
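A minimal sketch of the encoding step, assuming a squared Euclidean distance and an already trained codebook (the codebook values below are made up for illustration; training, for example with a clustering algorithm on the training base, is not shown):

```python
def vq_encode(vector, codebook):
    """Index of the code vector closest to `vector` (squared Euclidean distance)."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(range(len(codebook)), key=lambda i: dist2(vector, codebook[i]))


def vq_decode(index, codebook):
    """The decoder only needs the index and a copy of the codebook."""
    return codebook[index]


# Hypothetical 2-bit codebook for the pair (k1, k2).
CODEBOOK = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
index = vq_encode((0.4, -0.7), CODEBOOK)   # only this index is transmitted
```

With 4 code vectors, the pair (k1, k2) costs 2 bits in total instead of a separate scalar code for each coefficient.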
Line Spectrum Frequencies LSF, LSP
The line spectrum frequencies $f_i$ (LSF) and the line spectrum pairs $\cos(2\pi f_i)$ (LSP) have good properties for quantization and interpolation.
The LSF and LSP are derived from the inverse filter A(z).
Build the symmetrical and antisymmetrical polynomials $F_1(z)$ and $F_2(z)$ (for order 10):
$$F_1(z) = \frac{A(z) + z^{-11} A(z^{-1})}{1 + z^{-1}}$$
$$F_2(z) = \frac{A(z) - z^{-11} A(z^{-1})}{1 - z^{-1}}$$
LSF and LSP
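The roots of $F_1$ and $F_2$ lie on the unit circle and are interleaved.
Each polynomial has 5 pairs of conjugate roots $\exp(\pm j\omega_i)$, with $f_i = \omega_i/(2\pi)$.
$$F_1(z) = \prod_{i=1,3,\dots,9} \left(1 - 2 q_i z^{-1} + z^{-2}\right)$$
$$F_2(z) = \prod_{i=2,4,\dots,10} \left(1 - 2 q_i z^{-1} + z^{-2}\right)$$
$$q_i = \cos\omega_i = \cos 2\pi f_i, \qquad f_i = \text{LSF}, \quad q_i = \text{LSP}$$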
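The construction can be illustrated with the sketch below, which works for any even order p (p + 1 replaces 11 in the exponent). It is not the method of any particular standard: the roots are located with a plain grid scan followed by bisection, and the input is assumed to be the list of coefficients of A(z) in powers of $z^{-1}$, starting with 1, for a minimum-phase A(z).

```python
import math


def _deflate(poly, sign):
    """Divide poly(z^-1) by (1 + sign * z^-1) using synthetic division."""
    out = [poly[0]]
    for c in poly[1:-1]:
        out.append(c - sign * out[-1])
    return out


def _on_circle(q, w):
    """Real value of a symmetric polynomial on the unit circle, up to a phase factor."""
    half = len(q) // 2
    return q[half] + 2.0 * sum(q[k] * math.cos((half - k) * w) for k in range(half))


def _roots(q, grid):
    """Angles in (0, pi) where the polynomial vanishes: grid scan, then bisection."""
    found = []
    prev_w, prev_v = 0.0, _on_circle(q, 0.0)
    for i in range(1, grid + 1):
        w = math.pi * i / grid
        v = _on_circle(q, w)
        if prev_v * v < 0.0:
            lo, hi, v_lo = prev_w, w, prev_v
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                v_mid = _on_circle(q, mid)
                if v_lo * v_mid <= 0.0:
                    hi = mid
                else:
                    lo, v_lo = mid, v_mid
            found.append(0.5 * (lo + hi))
        prev_w, prev_v = w, v
    return found


def lsf_from_lpc(a_poly, grid=512):
    """Line spectrum frequencies w_i (radians, sorted) of A(z) = sum a_poly[k] z^-k."""
    ext = list(a_poly) + [0.0]                   # A(z), padded to degree p + 1
    rev = ext[::-1]                              # z^-(p+1) * A(1/z)
    f1 = _deflate([u + v for u, v in zip(ext, rev)], +1.0)   # removes root z = -1
    f2 = _deflate([u - v for u, v in zip(ext, rev)], -1.0)   # removes root z = +1
    return sorted(_roots(f1, grid) + _roots(f2, grid))
```

The LSP values follow as `math.cos(w)` for each returned angle; the sorted list alternates between roots of $F_1$ and roots of $F_2$, which is the interleaving property.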
Coders using Short Term and Long Term Prediction: RELP, MPE LP, CELP
[Figure: analysis and synthesis structure.
Analysis: speech signal x(n) -> A(z) -> residual signal e(n) -> B(z) -> r(n) -> coding of r(n) -> quantized residual u(n). The r_{i,j} and a_i are calculated for A(z), b and M for B(z); the coefficients b, M, a_i are quantized.
Synthesis: quantized residual u(n) -> 1/B(z) -> 1/A(z) -> synthetic speech signal, using the quantized coefficients.]
RPE-LTP GSM Full Rate Coders
The GSM Full Rate coder is called RPE-LTP = Regular Pulse Excited, Long Term Prediction coder.
The signal u = the best down-sampled version ( 4) of the residual signal r.
In CELP coders, vector quantization is applied to the signal.
CELP = Code Excited Linear Prediction coder.
Each frame of the residual signal is compared to sequences of signal stored in a codebook.
The codebook sequences are white, and the codebook is called the stochastic codebook.
CELP Coder Basic Scheme
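Analysis by synthesis (closed loop) to find the best excitation sequence.
[Figure: CELP coder basic scheme. The codeword of index q from the normalized stochastic codebook is scaled by the gain g and filtered by 1/B(z) and 1/A(z) to give the synthetic speech signal, which is subtracted from the original speech signal x; a least square (LS) criterion on the error selects q.]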
Structure of CELP Coder: Perceptual Filter
Perceptual filter: the reconstruction error is spectrally weighted, exploiting the noise masking properties of the formants.
$W(z) = A(z/\gamma_1)/A(z/\gamma_2)$, $0 \le \gamma_1, \gamma_2 \le 1$
$A^*(z) = A(z/\gamma)$ (the poles of $1/A^*(z)$ are moved towards zero)
[Figure: the basic CELP scheme with W(z) applied to the error between the original speech signal x and the synthetic speech signal, before the LS criterion.]
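A short sketch of how W(z) can be built and applied. Replacing z by z/γ multiplies the k-th coefficient of A(z) by γ^k. The default values γ1 = 0.9 and γ2 = 0.6 are assumptions chosen for illustration only, and the function names are hypothetical.

```python
def bandwidth_expand(a_poly, gamma):
    """Coefficients of A(z/gamma): the k-th coefficient is scaled by gamma**k."""
    return [c * gamma ** k for k, c in enumerate(a_poly)]


def iir_filter(num, den, x):
    """Direct-form filter num(z)/den(z) with den[0] == 1 and zero initial state."""
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y


def perceptual_weighting(a_poly, x, gamma1=0.9, gamma2=0.6):
    """Apply W(z) = A(z/gamma1) / A(z/gamma2) to the signal x (gammas are assumed values)."""
    return iir_filter(bandwidth_expand(a_poly, gamma1),
                      bandwidth_expand(a_poly, gamma2), x)
```

When γ1 = γ2 the filter reduces to W(z) = 1, so no weighting is applied; the further apart the two factors are, the more the error is allowed to concentrate under the formants.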
CELP Coder with Perceptual Filter
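[Figure: CELP coder with perceptual filter. LPC analysis and pitch estimation are performed on the original signal. Each waveform of the codebook (indices 0 to M-1) is scaled by the gain g and filtered by 1/B(z) and 1/A(z) to give the synthetic speech; the error with the original signal is weighted by W(z) and fed to the perceptual criterion. Iterating on the whole codebook gives the best code and gain. Inset: W(f) and 1/A(f) plotted from 0 to 3500 Hz.]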
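Basic CELP Structure: Perceptual Filter Inserted in the 2 Branches
[Figure: the speech frame is filtered by W(z) to give the weighted speech frame s. The codebook vector c_j is filtered by 1/B(z), giving the excitation e, then by H(z), which includes W(z), giving p. The least square criterion is applied to the difference between s and p.]
H(z) = W(z)/A(z)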
CELP Structure: Memory of H(z)
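Memory of H(z) = output for a zero input
$h_i$ = impulse response of H(z)
$$\hat p(n) = \sum_{i=0}^{\infty} h_i\,e_{n-i}$$
$$\hat p(n) = \sum_{i=0}^{n} h_i\,e_{n-i} + \sum_{i=n+1}^{\infty} h_i\,e_{n-i} = \sum_{i=0}^{n} h_i\,e_{n-i} + p_0(n), \qquad p_0(n)\ \text{known}$$
The memory $p_0(n)$ depends only on the past excitation, so it is removed from both branches: the zero-state output is $\hat p(n) - p_0(n)$ and the target is
$$p(n) = s(n) - p_0(n)$$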
CELP Coder: Memory of H(z)
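[Figure: the speech frame is filtered by W(z) to give s; the memory p0 of H(z) is subtracted from s to give the target p. The codebook vector c_j is filtered by 1/B(z) and by H(z) (including W(z)) with initial conditions equal to zero. The least square criterion is applied to the error e between the target p and this filtered excitation.]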
CELP: Adaptive Codebook
LTP can be realized by an adaptive codebook
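$$\hat p_n - p^0_n = \sum_{i=0}^{n} h_i\,e_{n-i} = \sum_{i=0}^{n} h_i \left(g\,c^{(j)}_{n-i} + b\,e_{n-i-M}\right) = \hat p^2_n + \hat p^1_n$$
$$\hat p^1_n = b \sum_{i=0}^{n} h_i\,e_{n-i-M}$$
$$\hat p^2_n = g \sum_{i=0}^{n} h_i\,c^{(j)}_{n-i}$$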
CELP with Stochastic Codebook
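[Figure: CELP coder with adaptive and stochastic codebooks. The vector c1,i(0) of the adaptive codebook, scaled by g1, and the vector c2,i(1) of the stochastic codebook, scaled by g2, are each filtered by H(z) and summed. The speech frame is filtered by W(z) to give s, from which the H filter memory p0 is subtracted to give the target p. The least square criterion is applied to the error e between p and the summed filtered vectors.]
The adaptive codebook stores the past residual frames. It is called adaptive because its content changes with time.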
CELP Decoder
[Figure: CELP decoder. The adaptive codebook vector c1,i(0) scaled by g1 and the stochastic codebook vector c2,i(1) scaled by g2 are summed and filtered by 1/A(z) to give the synthetic speech.]
Decode received parameters:
Index of stochastic codebook
Gain of stochastic codebook
Index of adaptive codebook
Gain of adaptive codebook
Linear prediction filter coefficients
CELP Equations
Example: Searching through Codebooks
The main load is the filtering of all the codebook vectors.
[Figure: the vector $c_{j,i(j)}$ of the jth original codebook is filtered by H(z) with null initial conditions to give the vector $f_{j,i(j)}$ of the jth filtered codebook.]
$$\hat p = \sum_j g_j\,f_{j,i(j)}$$
$$f_{j,i(j)}(n) = h(n) * c_{j,i(j)}(n) = \sum_{k=0}^{n} h(k)\,c_{j,i(j)}(n-k)$$
$$f_{j,i(j)} = H\,c_{j,i(j)}$$
Filtering Matrix H
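$$H = \begin{pmatrix} h_0 & 0 & 0 & \cdots & 0 \\ h_1 & h_0 & 0 & \cdots & 0 \\ h_2 & h_1 & h_0 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ h_{N-1} & \cdots & h_2 & h_1 & h_0 \end{pmatrix}$$
h(n) is the impulse response corresponding to H(z).
N = length of the codebook vectors.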
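The sketch below builds this lower triangular Toeplitz matrix and checks that the product H c equals the zero-state convolution of c with h truncated to N samples. The helper names are hypothetical, and h must hold at least N samples.

```python
def filtering_matrix(h, n):
    """N x N lower triangular Toeplitz matrix with H[r][c] = h[r - c] for r >= c."""
    return [[h[r - c] if r >= c else 0.0 for c in range(n)] for r in range(n)]


def mat_vec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def zero_state_filter(h, c):
    """f[n] = sum_{k=0..n} h[k] * c[n-k]: filtering with null initial conditions."""
    return [sum(h[k] * c[n - k] for k in range(n + 1)) for n in range(len(c))]
```

For h = [1, 0.5, 0.25] and c = [1, 0, -1], both `mat_vec(filtering_matrix(h, 3), c)` and `zero_state_filter(h, c)` give [1, 0.5, -0.75].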
Finding the Best Excitation in the Coder: Equation of the Solution
J: least square criterion.
For a set of 2 vectors $c_{j,i(j)}$, F is the 2-column matrix of the filtered vectors $f_{j,i(j)}$.
$$J = \|p - \hat p\|^2 = \|p - Fg\|^2$$
Optimal gain:
$$F^T F\,g = F^T p$$
$$\min J \;\Leftrightarrow\; \max \|\hat p\|^2 = p^T F \left(F^T F\right)^{-1} F^T p$$
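For two filtered vectors, the normal equations form a 2 x 2 system that can be solved in closed form. The sketch below is an illustration with hypothetical names; it assumes the two filtered vectors are not collinear, so that the determinant is non-zero.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def optimal_gains(p, f1, f2):
    """Solve F^T F g = F^T p for F = [f1 f2].

    Returns ((g1, g2), J) where J = ||p - F g||^2 is the residual error.
    """
    a11, a12, a22 = dot(f1, f1), dot(f1, f2), dot(f2, f2)
    b1, b2 = dot(f1, p), dot(f2, p)
    det = a11 * a22 - a12 * a12          # non-zero if f1, f2 are not collinear
    g1 = (b1 * a22 - b2 * a12) / det
    g2 = (a11 * b2 - a12 * b1) / det
    captured = g1 * b1 + g2 * b2         # equals p^T F (F^T F)^-1 F^T p
    return (g1, g2), dot(p, p) - captured
```

With p = [3, 2, 1], f1 = [1, 0, 0] and f2 = [1, 1, 0], the gains are (1, 2) and the residual error is 1, the energy of the component of p that neither vector can represent.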
CELP Optimal Solution
The optimal algorithm finds the combination of code vectors that maximizes the norm, and the optimal gains g_j.
But the number of combinations of codebook vectors is very high and the complexity is also great.
Example:
M = 1024 for the stochastic codebook and M = 256 for the adaptive codebook
lead to 1024 x 256 = 262 144 solutions to test and 1024 + 256 = 1280 vectors to filter.
Iterative Suboptimal Algorithm for 2 Codebooks
First step:
Target vector = p
Find the best vector in the adaptive codebook and its gain.
Calculate the new target vector p1:
$$p^1_n = p_n - \hat p^1_n$$
Second step:
Target vector = p1
Find the best vector in the stochastic codebook and its gain.
Iterative Algorithm
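[Figure: iterative algorithm. The speech is filtered by W(z) and the memory p0 is subtracted to give the target p. First LS search: the delayed excitation e(n-M), filtered by H(z), is fitted to p, giving the lag M and its gain; the new target p1 is p minus this contribution. Second LS search: the codebook vector c(n,j), filtered by H(z), is fitted to p1, giving the index j and its gain.]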
Operations of the Iterative Algorithm
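At step j, the optimal codebook vector has index i:
$$i = \arg\min_k \|p_j - \hat p_j\|^2 = \arg\min_k \|p_j - g_j\,f_{j,k}\|^2$$
$$g_{\text{opt},k} = \frac{p_j^T f_{j,k}}{f_{j,k}^T f_{j,k}} = \frac{p_j^T H c_{j,k}}{c_{j,k}^T H^T H c_{j,k}}$$
$$i = \arg\max_k \frac{\left(p_j^T H c_{j,k}\right)^2}{c_{j,k}^T H^T H c_{j,k}}, \qquad \text{with } g_j = g_{\text{opt},i}$$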
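The two-step search can be sketched as below. This is an illustration with hypothetical names; it assumes the codebook vectors have already been filtered by H (the `filtered` lists hold the vectors H c).

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def search_codebook(target, filtered):
    """Index and gain maximizing (target . f)^2 / (f . f) over the filtered vectors."""
    best_k, best_score, best_g = -1, -1.0, 0.0
    for k, f in enumerate(filtered):
        energy = dot(f, f)
        if energy <= 0.0:
            continue                       # skip all-zero vectors
        corr = dot(target, f)
        score = corr * corr / energy
        if score > best_score:
            best_k, best_score, best_g = k, score, corr / energy
    return best_k, best_g


def two_stage_search(p, adaptive_filtered, stochastic_filtered):
    """Adaptive codebook first, then stochastic codebook on the updated target."""
    i1, g1 = search_codebook(p, adaptive_filtered)
    p1 = [t - g1 * f for t, f in zip(p, adaptive_filtered[i1])]   # new target
    i2, g2 = search_codebook(p1, stochastic_filtered)
    return (i1, g1), (i2, g2)
```

Only M + Ma candidates are evaluated instead of M x Ma joint combinations, at the price of a solution that is not guaranteed to be the joint optimum.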
Iterative Algorithm
Numerical Example :
FS=8000Hz,
M=256 size of the stochastic codebook
Ma=128 size of the adaptive codebook
Frame size NT=160, 20ms
Frames split in 4 subframes of N=40 samples
p=10 linear prediction order
10 Mips to filter the stochastic codebook.
Iterative Algorithm
The main processing load is the filtering of the codebook vectors.
Many algorithms have been proposed to decrease the computation load:
Special structures of the codebook:
  VSELP: Vector Sum
  Algebraic codebook: ACELP
  Linear codebook (the adaptive codebook is linear).
Structure of H avoiding the filtering:
  Diagonalization of $H^T H$
CELP Coding
Standards from 4.8 kbps to 16 kbps
Federal standard (DOD) (4.8 kbps)
frame = 240 samples (30 ms)
LPC 8 --> (LSP coding 34 bits)
adaptive codebook (256 vectors, fractional pitch)
stochastic codebook (512 vectors with components in {-1, 0, 1})
VSELP (Vector Sum Excited Linear Prediction)
Codebook vectors v are combinations of basis vectors (b1, b2, ..., bk):
v = +/- b1 +/- b2 +/- ... +/- bk
Only the basis vectors are filtered.
Motorola (8 kbps)
GSM (half rate) (5.6 kbps)
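Because filtering is linear, the filtered version of any codebook vector is the same signed sum of the filtered basis vectors. The sketch below illustrates this; the mapping from index bits to signs is an assumption made for the example, not the mapping of a specific standard.

```python
def zero_state_filter(h, c):
    """f[n] = sum_{k=0..n} h[k] * c[n-k] (null initial conditions)."""
    return [sum(h[k] * c[n - k] for k in range(n + 1)) for n in range(len(c))]


def vector_sum(basis, index):
    """v = sum_k s_k * b_k with s_k = +1 if bit k of index is set, else -1 (assumed mapping)."""
    out = [0.0] * len(basis[0])
    for k, b in enumerate(basis):
        sign = 1.0 if (index >> k) & 1 else -1.0
        for n, value in enumerate(b):
            out[n] += sign * value
    return out


def filtered_codebook(basis, h):
    """Filter the k basis vectors once; the 2**k filtered code vectors follow by sign sums."""
    filtered_basis = [zero_state_filter(h, b) for b in basis]
    return [vector_sum(filtered_basis, index) for index in range(2 ** len(basis))]
```

With k basis vectors, 2^k code vectors are available but only k filtering operations are needed.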
Fractional Pitch
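The precision of the pitch period is a fraction of the sampling period $T_S$. An interpolation filter is used.
$B(z) = 1 - b\,z^{-M_f}$ with $M_f = M + \delta$
$x(n - M - \delta)$ can be written as:
$$FT^{-1}\left(X(f)\,e^{-j 2\pi f (M+\delta) T_S}\right) = x(n-M) * FT^{-1}\left(e^{-j 2\pi f \delta T_S}\right) = x(n-M) * h(n)$$
$$h(n) = \mathrm{sinc}(n - \delta)$$
[Figure: |H(f)| of the interpolation filter, flat from $-F_S/2$ to $+F_S/2$, and its impulse response h for $\delta = 1/4\ T_S$, time in $T_S$.]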
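The interpolation can be sketched with a truncated sinc. This is an illustration only: the truncation length (8 taps on each side) is an assumed value, no smoothing window is applied, and the function name is hypothetical. Practical coders use short windowed interpolation filters.

```python
import math


def sinc(t):
    """Normalized sinc: sin(pi t) / (pi t), with sinc(0) = 1."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)


def fractional_delay(x, n, m, frac, half_len=8):
    """Approximate x(n - m - frac) as sum_k sinc(k - frac) * x[n - m - k].

    m is the integer part of the lag, frac the fractional part (in samples).
    half_len (assumed value) truncates the ideal, infinitely long sinc filter.
    """
    total = 0.0
    for k in range(-half_len, half_len + 1):
        idx = n - m - k
        if 0 <= idx < len(x):
            total += x[idx] * sinc(k - frac)
    return total
```

With `frac = 0` the filter reduces to a unit impulse and the function returns x[n - m]; with `frac = 0.25` it estimates the signal a quarter of a sample earlier.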
Standards
Wired Telephony (ITU-T)
  G 711 (1972): PCM, 64 kbps
  G 721 (1984): ADPCM, 32 kbps
  G 728 (1991): LD-CELP, 16 kbps
  G 729: CS-ACELP, 8 kbps
Mobile Communications (ETSI, CTIA)
  GSM (FR): RPE-LTP, 13 kbps
  GSM (HR): VSELP, 5.6 kbps
  GSM (EFR): ACELP, 12.2 kbps
  UMTS (AMR): ACELP, 12.2 to 4.75 kbps
Military applications (NATO)
  FS 1015 (1976): LPC10, 2.4 kbps
  FS 1016 (1991): CELP, 4.8 kbps
AMR Adaptive MultiRate Coder for 3G Application
8 narrow band (NB AMR) source coders: 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbps
9 wide band (WB AMR) coders
Based on ACELP
Frame of 20 ms, fs=8000 Hz
ITU Civil Standards
ITU standard | Method | Year | Bit rate (kbps) | Delay (ms) | Quality (MOS) | Complexity (MIPS)
G711 | PCM | 1972 | 64 | 0.125 | 4.3 |
ETSI and Inmarsat Standards
Inmarsat standard | Method | Year | Bit rate (kbps) | Delay (ms) | Quality (MOS) | Complexity (MIPS)
IMBE | IMBE | 1990 | 4.15 | 78.75 | 3.4 | 7
Mini-M | AMBE | 1995 | 3.6 | | 3.6 | 25

ETSI standard (Europe) | Method | Year | Bit rate (kbps) | Delay (ms) | Quality (MOS) | Complexity (MIPS)
GSM | RPE-LTP | 1987 | 13 | 40 | 3.47 | 6
TETRA | ACELP | 1994 | 4.567 | 67.5 | | 12
GSM HR | VSELP | 1994 | 5.6 | 45 | | 30
EFR-GSM (identical to DCS) | ACELP | 1995 | 12.2 | 40 | | 15
TIA and RCR Standards
TIA standard (USA) | Method | Year | Bit rate (kbps) | Delay (ms) | Quality (MOS) | Complexity (MIPS)
IS54 | VSELP | 1989 | 7.95 | 45 | 3.5 | 20
IS96 | QCELP | 1993 | 8.5-4.2-0.8 | 45 | 4 | 20
IS136 | ACELP | 1995 | 7.4 | 45 | |
IS127 EVRC | RCELP | 1995 | 8.5-4-0.8 | >45 | | 20
PCS 1900 | ACELP | 1995 | | 40 | | 15

RCR standard | Method | Year | Bit rate (kbps) | Delay (ms) | Quality (MOS) | Complexity (MIPS)
RCR STD-27B | VSELP | 1990 | 6.7 | 45 | | 20
JDC HR | PSI-CELP | 1993 | 3.45 | 90 | | 50
Implementation of CELP Coders on C54x
Example of the G729 Annex A.
Specific instruction for codebook search
Some functions of DSPLIB
Profiling Example for G729 Annex A using C Compiler
G729 is a CS-ACELP coder (ITU 1995) at 8 kbps, with the quality of ADPCM G726 at 32 kbps.
G729 Annex A: DSVD (Digital Simultaneous Voice & Data), voice over internet, voice e-mail.
G729 Annex A: Main Blocks of the Coder Algorithm
Frame = 10 ms = 80 samples.
Short term LPC analysis on a 30 ms window (240 samples).
LSP derived from the a_i coefficients and quantized using split VQ.
Long term (LTP) analysis on 2 subframes of 40 samples:
  LTP lag and gain. Fractional LTP lag (1/3 resolution).
  8 bits for the 1st subframe and 5 bits for the 2nd.
Fixed codebook search on 2 subframes of 40 samples:
  Index and gains.
  Code length = 40 with 4 non-zero pulses (+/-1).
Structures of Frames
[Figure: layout of the speech buffer, in samples: previous speech (120) | subframe 1 (40) | subframe 2 (40) | next frame (40).]
Present frame, 80 samples = L_Frame
New speech, 80 samples = L_Frame
Total speech vector, 240 samples = L_Total
LPC analysis window, 240 samples = L_Window
G729 Annex A, Bit Allocation
Parameter | 1st subframe (5 ms) in bits | 2nd subframe (5 ms) in bits | Total in bits
LPC filter | | | 18
Adaptive codebook: pitch period | 8 | 5 | 13
Adaptive codebook: parity check | 1 | | 1
Fixed codebook: pulse positions | 13 = (3x3+4) | 13 = (3x3+4) | 26
Fixed codebook: pulse signs | 4 | 4 | 8
Codebook gains | 7 = 3+4 | 7 = 3+4 | 14
Total | | | 80
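The allocation can be cross-checked against the 8 kbps rate with a few lines of arithmetic. The dictionary keys are hypothetical labels for the rows of the table above.

```python
# Bits per 10 ms frame, taken from the bit allocation table.
BITS_PER_FRAME = {
    "lpc_filter": 18,
    "pitch_period": 8 + 5,        # 1st + 2nd subframe
    "parity_check": 1,
    "pulse_positions": 13 + 13,   # (3 x 3 + 4) bits per subframe
    "pulse_signs": 4 + 4,
    "codebook_gains": 7 + 7,      # (3 + 4) bits per subframe
}

FRAME_MS = 10
total_bits = sum(BITS_PER_FRAME.values())
bit_rate_bps = total_bits * 1000 // FRAME_MS
```

The total is 80 bits per frame, which gives 8000 bps for 100 frames per second.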
G729 Annex A: Main Blocks of the Decoder Algorithm
The serially received bits are converted into parameters:
  LSP vector, 2 fractional pitch lags and gains, 2 fixed codebook indices and gains.
The LSP are converted to LP filter coefficients a_i and interpolated at each subframe.
At each subframe:
  The excitation is constructed and scaled.
  The speech is synthesized by filtering the excitation by the LP synthesis filter.
Postprocessing by an adaptive postfilter.
Using the C Compiler
Use the C program of the standard and the C compiler with maximum optimization.
Autocorrelation = 488 445 cycles
Levinson = 164 843 cycles
Conversion a_i -> LSF = 410 404 cycles
LSF quantization = 883 853 cycles
Synthesis filtering = 501 472 cycles
Pitch open loop = 793 533 cycles
Fractional pitch = 2 x 618 354 cycles
Search algebraic code = 2 x 617 582 cycles
Gains quantization = 2 x 108 480 cycles
Assembly Language Instructions for Codebook Search
Better results can be obtained with assembly language than with C.
Specific instructions for codebook search: conditional stores.
Codebook Search (Conditional Stores)
STRCD Xmem, cond = Store T Conditionally: Xmem = T if condition is true
SRCCD Xmem, cond = Store Block Repeat Counter Conditionally: Xmem = BRC if condition is true
SACCD src, Xmem, cond = Store Accumulator Conditionally: Xmem = src if condition is true
Assembly Language Codebook Search
Best codebook index: find the max of $c_i^2 / G_i$
$$i_{\text{opt}} = \arg\max_i \frac{c_i^2}{G_i}$$
To avoid division (valid since $G_i > 0$ and $G_{\text{opt}} > 0$):
$$\frac{c_i^2}{G_i} > \frac{c_{\text{opt}}^2}{G_{\text{opt}}} \;\Leftrightarrow\; c_i^2\,G_{\text{opt}} > c_{\text{opt}}^2\,G_i$$
Assembly Language Codebook Search
        .mmregs
        .text
; Pointers: AR5 -> C(0)..., AR2 -> G(0)..., AR3 -> Gopt then Copt2, AR4 -> Iopt
CBS:    STM   #C, AR5
        STM   #G, AR2
        STM   #G-opt, AR3
        STM   #I-opt, AR4
        ST    #0, *AR4            ; Iopt = 0
        ST    #1, *AR3+           ; Gopt = 1
        ST    #0, *AR3-           ; Copt2 = 0
        STM   #N-1, BRC
        RPTBD done
        SQUR  *AR5+, A            ; A = C(i)^2
        MPYA  *AR3+               ; B = C(i)^2 * Gopt, T = Gopt
        MAS   *AR2+, *AR3-, B     ; B = C(i)^2 * Gopt - G(i) * Copt2
        SRCCD *AR4, BGEQ          ; if (B >= 0): BRC -> Iopt
        STRCD *AR3+, BGEQ         ; if (B >= 0): T -> Gopt
        SACCD A, *AR3-, BGEQ      ; if (B >= 0): A -> Copt2
        SQUR  *AR5+, A
done:   MPYA  *AR3+
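For readers who do not know the C54x instruction set, the same division-free comparison can be written in Python. This is a sketch of the logic only, with a hypothetical function name; it stores the loop index directly, whereas the assembly version stores the block repeat counter, and it does not model fixed-point scaling or the pipeline of the delayed block repeat.

```python
def codebook_search_no_division(c, g):
    """Index i maximizing c[i]^2 / g[i] for g[i] > 0, without any division.

    Mirrors the loop above: Gopt starts at 1, Copt2 at 0, Iopt at 0, and the
    best candidate is replaced whenever C(i)^2 * Gopt - G(i) * Copt2 >= 0.
    """
    i_opt, c2_opt, g_opt = 0, 0, 1
    for i in range(len(c)):
        c2 = c[i] * c[i]
        if c2 * g_opt - g[i] * c2_opt >= 0:    # same test as the BGEQ condition
            i_opt, c2_opt, g_opt = i, c2, g[i]
    return i_opt
```

For c = [3, 4, 2] and G = [9, 4, 2] the ratios are 1, 4 and 2, so the search returns index 1. Because the test uses >=, a later candidate with exactly the same ratio replaces the current best.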