Speech Compression Using Wavelet Technique
Presented by:
Nitesh Mahto 2SD10EC068
S. Srinidhi 2SD10EC090
Sandeep Kumar 2SD10EC092
Vivek Kumar 2SD10EC121
Project Guide: Sharada C. Sajjan
Content
Introduction
Speech Compression
Techniques involved in speech compression
µ-Law
A-Law
Wavelet Compression Technique
Output
Conclusion
Bibliography
Introduction
Compression?
Goals of compression:
1. Reduce the bandwidth needed for transmission.
2. Make the decoded signal sound as close as possible to the original.
3. Keep implementation complexity as low as possible.
4. Be robust and scalable.
5. Reduce the number of bits needed to represent the signal.
Speech Properties
Nonlinear frequency response of human hearing: the audible range is approximately 20 Hz to 20 kHz.
This range can be divided into critical bands, whose widths are non-uniform, non-linear, and dependent on the level of the incoming sound.
Masking property: a strong signal hides a weaker signal that overlaps it.
Masking can be observed in both the time domain and the frequency domain.
Speech Compression
The idea of speech compression is to encode speech data so that it takes up less storage space and less bandwidth for transmission.
Compression Categories
Lossless
Lossy
Lossless compression: used where the decompressed data must be identical to the original.
Lossless compression is typically required for text and data files, such as bank records and text articles.
Lossy compression: permanently discards redundant or "unnecessary" pieces of data, so the decompressed data only approximates the original.
It cannot be used to compress anything that needs to be reproduced exactly.
It is used to compress multimedia data (audio, video, and still images), especially in applications such as streaming media and internet telephony.
µ-law compression
Reduces the number of bits required to encode each speech sample.
Used mainly in telephone systems in the United States and Japan.
Higher amplitudes of the signal are compressed before the ADC and expanded after the DAC.
Quantization error is uniformly distributed.
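The companding idea can be sketched in a few lines of Python. This is only an illustration under assumptions that are not stated on the slide: samples are normalised to [-1, 1], µ = 255 (the value commonly used in 8-bit telephony), the compression curve is F(x) = sgn(x) ln(1 + µ|x|) / ln(1 + µ), and the function names `mu_law_compress`, `mu_law_expand`, and `quantize` are hypothetical.

```python
import math

MU = 255.0  # assumed value; common in 8-bit telephony


def mu_law_compress(x, mu=MU):
    """Map a sample x in [-1, 1] to [-1, 1] along the mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)


def quantize(y, bits=8):
    """Uniform mid-point quantizer on [-1, 1] (illustrative only)."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(levels - 1, int((y + 1.0) / step))
    return -1.0 + (index + 0.5) * step


# Compare the error on a quiet sample with and without companding.
x = 0.01
direct_error = abs(quantize(x) - x)
companded_error = abs(mu_law_expand(quantize(mu_law_compress(x))) - x)
print(direct_error, companded_error)
```

For a quiet sample such as 0.01, the companded path should leave a smaller error than direct uniform quantization with the same number of bits, which is the reason for compressing large amplitudes before quantizing.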
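A-Law Compression
A-law is a standard companding algorithm, similar to the µ-law algorithm. For a given input x with |x| ≤ 1, the equation for A-law encoding is as follows:
\[
F(x) = \operatorname{sgn}(x)
\begin{cases}
\dfrac{A|x|}{1 + \ln A}, & |x| < \dfrac{1}{A} \\[2ex]
\dfrac{1 + \ln(A|x|)}{1 + \ln A}, & \dfrac{1}{A} \le |x| \le 1
\end{cases}
\]
where A is the compression parameter (A = 87.6 in the European standard).
A-law expansion is given by the inverse function:
\[
F^{-1}(y) = \operatorname{sgn}(y)
\begin{cases}
\dfrac{|y|(1 + \ln A)}{A}, & |y| < \dfrac{1}{1 + \ln A} \\[2ex]
\dfrac{e^{|y|(1 + \ln A) - 1}}{A}, & \dfrac{1}{1 + \ln A} \le |y| \le 1
\end{cases}
\]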
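A minimal Python sketch of the standard A-law curve and its inverse follows. It assumes samples normalised to [-1, 1] and A = 87.6; the function names `a_law_compress` and `a_law_expand` are hypothetical and chosen only for this illustration.

```python
import math

A = 87.6  # assumed compression parameter (European telephony value)


def a_law_compress(x, a=A):
    """A-law encoding of a sample x in [-1, 1]."""
    ax = abs(x)
    if ax < 1.0 / a:
        y = a * ax / (1.0 + math.log(a))                    # linear segment near zero
    else:
        y = (1.0 + math.log(a * ax)) / (1.0 + math.log(a))  # logarithmic segment
    return math.copysign(y, x)


def a_law_expand(y, a=A):
    """A-law decoding: inverse of a_law_compress."""
    ay = abs(y)
    if ay < 1.0 / (1.0 + math.log(a)):
        x = ay * (1.0 + math.log(a)) / a
    else:
        x = math.exp(ay * (1.0 + math.log(a)) - 1.0) / a
    return math.copysign(x, y)


# Round trip on a few samples, including one inside the linear segment.
for sample in (-0.8, -0.005, 0.0, 0.005, 0.8):
    print(sample, a_law_expand(a_law_compress(sample)))
```

The linear segment near zero is the main structural difference from µ-law, which is logarithmic over the whole range.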
Difference between A-Law & µ-Law Compression
A-Law Compression:
• A-law has a smaller dynamic range of the output.
• A-law is used in Europe.
• A-law has less distortion.
• A-law takes precedence over µ-law on international calls.
µ-Law Compression:
• µ-law has a larger dynamic range of the output.
• µ-law is currently used by companies in North America and Japan.
• µ-law has worse distortion with small signals compared to A-law.
Fourier Transform
A signal can be expressed as the sum of a (possibly infinite) series of sines and cosines.
The Fourier transform is the mathematical transformation used to move a signal between the time domain and the frequency domain.
It can determine all the frequencies present in a signal.
Limitations of Fourier Transforms
Provides frequency resolution only, with no time resolution.
Not useful for analyzing time-variant, non-stationary signals.
Not efficient for representing discontinuities or sharp corners.
Provides excellent localization in the frequency domain but poor localization in the time domain.
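The lack of time resolution can be shown with a small experiment. The sketch below is an illustration, not part of the project code: it builds two hypothetical test signals that contain the same two tones in opposite order and compares their DFT magnitudes using a plain (slow) DFT written with the standard library only.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitude of the discrete Fourier transform, computed directly."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


N = 64
# Low tone in the first half, high tone in the second half.
low_then_high = [math.sin(2 * math.pi * 4 * t / N) for t in range(N // 2)] + [
    math.sin(2 * math.pi * 16 * t / N) for t in range(N // 2, N)
]
# Same samples with the two halves swapped: high tone first, low tone second.
high_then_low = low_then_high[N // 2:] + low_then_high[:N // 2]

mag_a = dft_magnitudes(low_then_high)
mag_b = dft_magnitudes(high_then_low)
largest_gap = max(abs(p - q) for p, q in zip(mag_a, mag_b))
print(largest_gap)  # effectively zero: the magnitude spectra coincide
```

The two signals differ in when each tone occurs, yet their magnitude spectra match; the timing information survives only in the phase, which is hard to read directly.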
Why wavelets?
Wavelets give both time resolution and frequency resolution by using scaled and translated versions of a mother wavelet.
What is a wavelet?
A wavelet is a "small wave" of effectively limited duration that has an average value of zero.
Scaling & Translation
Wavelet analysis produces a time-scale view of a signal.
Scaling a wavelet means stretching or compressing it.
Shifting (translating) a wavelet means delaying or advancing its starting point.
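Scaling and translation are usually written as follows. The symbols are assumptions introduced here for illustration: ψ is the mother wavelet, a is the scale, b is the translation, x(t) is the signal, and W(a, b) is the resulting wavelet coefficient.

```latex
\[
\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t - b}{a}\right), \qquad a \neq 0
\]
\[
W(a, b) = \int_{-\infty}^{\infty} x(t)\,\psi_{a,b}^{*}(t)\,dt
\]
```

A large |a| stretches the wavelet (coarse, low-frequency view), a small |a| compresses it (fine, high-frequency view), and b slides it along the time axis.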
Wavelet Decomposition (Daubechies)
• Approximations: high-scale, low-frequency components of the signal.
• Details: low-scale, high-frequency components of the signal.
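Single Level Decomposition
[Figure: the signal S is passed through two filters, a low-pass filter (LPF) and a high-pass filter (HPF). Each filter output is downsampled by 2, giving the approximation coefficients (LPF branch) and the detail coefficients (HPF branch).]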
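One level of this filter-and-downsample scheme can be sketched with the Haar wavelet, which is the simplest member of the Daubechies family (db1). This is an illustrative sketch, not the project code: the names `haar_dwt` and `haar_idwt` are hypothetical, the input length is assumed to be even, and the sign of the detail coefficients follows one common convention.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """One decomposition level: low-pass and high-pass, each downsampled by 2."""
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in pairs]  # LPF branch
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in pairs]  # HPF branch
    return approx, detail


def haar_idwt(approx, detail):
    """Rebuild the signal from one level of approximation and detail."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_dwt(x)
rebuilt = haar_idwt(approx, detail)
print(approx)
print(detail)
print(rebuilt)
```

Each output has half as many samples as the input, so the total number of coefficients is unchanged, and the signal is recovered exactly as long as no coefficient is altered.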
Implementation
Flow chart:
1. Start
2. Read a wave file
3. Decomposition (wavedec)
4. Thresholding and compression
5. Store / transmit
6. Reconstruct (waverec)
7. Convert into wave file
8. End
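The middle of the flow chart (decompose, threshold, reconstruct) can be sketched in plain Python. This is an illustration under several assumptions and does not reproduce the project's MATLAB code: a Haar wavelet stands in for the Daubechies wavelets, a synthetic sine replaces the wave file, the functions `decompose` and `reconstruct` are hypothetical stand-ins for wavedec and waverec, hard thresholding with a fixed threshold is assumed, and the compression figure is taken to be the percentage of coefficients set to zero.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(x):
    """One Haar analysis level: (approximation, detail)."""
    pairs = range(0, len(x), 2)
    return ([(x[i] + x[i + 1]) / SQRT2 for i in pairs],
            [(x[i] - x[i + 1]) / SQRT2 for i in pairs])


def haar_inverse_step(approx, detail):
    """One Haar synthesis level."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / SQRT2, (a - d) / SQRT2))
    return out


def decompose(x, levels):
    """Multi-level decomposition; returns [cA_n, cD_n, ..., cD_1]."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.insert(0, detail)
    return [approx] + details


def reconstruct(coeffs):
    """Inverse of decompose."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_inverse_step(approx, detail)
    return approx


def hard_threshold(coeffs, thr):
    """Zero every coefficient whose magnitude does not exceed thr."""
    return [[c if abs(c) > thr else 0.0 for c in band] for band in coeffs]


def zero_percentage(coeffs):
    """Share of coefficients equal to zero, in percent (assumed metric)."""
    flat = [c for band in coeffs for c in band]
    return 100.0 * sum(1 for c in flat if c == 0.0) / len(flat)


def snr_db(original, restored):
    """Signal-to-noise ratio in dB between the original and restored signals."""
    signal = sum(v * v for v in original)
    noise = sum((v - r) ** 2 for v, r in zip(original, restored))
    return float("inf") if noise == 0 else 10.0 * math.log10(signal / noise)


# Synthetic input; the length must be divisible by 2 ** levels.
signal = [math.sin(2 * math.pi * n / 32) for n in range(64)]
coeffs = decompose(signal, levels=3)
kept = hard_threshold(coeffs, thr=0.1)
restored = reconstruct(kept)
print(zero_percentage(kept), snr_db(signal, restored))
```

Only the non-zero coefficients need to be stored or transmitted, so raising the threshold trades a higher share of zeros for a lower SNR.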
Compression parameters
1) Differing levels
Level | Compression Ratio in % | SNR
1 | 43.56 | 10.36
2 | 48.10 | 10.35
3 | 64.91 | 10.34
4 | 65.81 | 10.33
5 | 65.75 | 10.31
2) Differing DBs
DB | Compression Ratio in % | SNR
1 | 52.04 | 10.20
5 | 65.52 | 10.33
10 | 65.75 | 10.34
Conclusion
In this project, we studied different compression techniques, mainly µ-law, A-law, and the wavelet transform.
We implemented them using first a sine wave and then a speech signal as input.
We then plotted the original signal, the compressed signal, the decompressed signal, and the error signal.
Focusing on the wavelet transform, we found that as the number of decomposition levels increases, the compression ratio rises (from 43.56% at level 1 to about 65.8% at levels 4 and 5) while the SNR falls only slightly (from 10.36 to 10.31). More levels therefore give better compression at a small cost in SNR.
With higher-order members of the DB (Daubechies) family, both the SNR and the compression ratio increase.
Bibliography
[1] D. Sinha and A. Tewfik, "Low Bit Rate Transparent Audio Compression using Adapted Wavelets", IEEE Trans. ASSP, Vol. 41, No. 12, December 1993.
[2] P. Srinivasan and L. H. Jamieson, "High Quality Audio Compression Using an Adaptive Wavelet Packet Decomposition and Psychoacoustic Modeling", IEEE Transactions on Signal Processing, Vol. 46, No. 4, April 1998.
[3] J. I. Agbinya, "Discrete Wavelet Transform Techniques in Speech Processing", IEEE Tencon Digital Signal Processing Applications Proceedings, IEEE, New York, NY, 1996, pp. 514-519.
[4] Ken C. Pohlmann, "Principles of Digital Audio", McGraw-Hill, Fourth Edition, 2000.
[5] X. Huang, A. Acero and H-W. Hon, "Spoken Language Processing: A Guide to Theory, Algorithm and System Development", Pearson Education, 1st Edition, 2001.
[6] S. G. Mallat, "A Wavelet Tour of Signal Processing", 2nd Edition, Academic Press, 1999. ISBN 0-12-466606-X.
[7] J. G. Proakis and D. G. Manolakis, "Digital Signal Processing: Principles, Algorithms, and Applications", Prentice-Hall, NJ, Third Edition, 1996.
[8] Mathworks, "Student Edition of MATLAB, Version 6.5", Prentice-Hall, NJ.