Implementing HD Voice on Mobile Devices
• Global IP Solutions (GIPS)
  – Recognized leader in world-class voice and video processing technology for IP networks
  – GIPS software is deployed in over a billion end-points
  – Enables developers, operators and mobile manufacturers to offer the highest quality regardless of network conditions
• Roar Hagen
  – Chief Technical Officer, GIPS
  – R&D in speech and video processing for more than 20 years
  – Ph.D. in Information Theory (source coding)
Where is HD Voice Most Needed?
CONFIDENTIAL
[Figure: voice quality scale with the labeled levels "Best possible PSTN Quality*", "Best possible Cell Phone Quality*" and "Hang-up, intolerable speech degradation"]
*Best quality means clean, or perfect conditions
• Where audio bandwidth is currently most constrained
• Where intelligibility is most difficult
• Where environmental factors (background noise, echo) further impair quality
• Conferencing/Collaboration
Mobile telephony is the most in need of a quality upgrade
Mobile Quality Issues
• Network Deficiencies
  – Low signal strength
  – Wi-Fi bottlenecks
  – Additional delay, jitter and packet loss in IP network
• Device Limitations
  – Limited processing power and battery life
  – OS limiting access to real-time VoIP
  – Recording and playout
• Environmental disturbances
  – Acoustic echo
  – Background noise
Design Challenges
[Figure: "Design Considerations" diagram with seven labeled areas]
• Network: Coping with network degradation
• Codec: Speech codec
• Hardware: Hardware issues (processor, OS, acoustics, etc.)
• Echo: Echo cancellation
• Power: Power consumption
• Voice: Additional voice processing components
• Environment: Background noise, room acoustics, etc.
Impact of Poor Networks
• Packet Loss
  – Occurs due to flushed buffers in network nodes
  – Same effect if packets are too late to be used
  – Smooth concealment necessary
• Network Jitter
  – Transmission time differs for each packet
  – Jitter buffer necessary to ensure continuous playout
  – Trade-off between delay and quality
• Latency
  – Major effect is “stepping on each other’s talk”
  – Long delays make echo more annoying
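The trade-off between delay and quality in a jitter buffer can be illustrated with a small calculation. The sketch below is not from the presentation: the function name `late_loss_rate`, the fixed (non-adaptive) playout delay, and the transit times are all hypothetical, chosen only to show how a deeper buffer turns late packets into usable ones at the cost of added delay.

```python
def late_loss_rate(transit_ms, buffer_ms):
    """Fraction of packets that miss a fixed playout deadline.

    transit_ms: per-packet network transit times in ms (hypothetical data).
    buffer_ms:  playout delay added on top of the fastest packet's transit time.
    A packet that arrives after the deadline is as useless as a lost one.
    """
    if not transit_ms:
        return 0.0
    deadline = min(transit_ms) + buffer_ms
    late = sum(1 for t in transit_ms if t > deadline)
    return late / len(transit_ms)


# Hypothetical transit times: mostly ~40 ms with a few jitter spikes.
transit = [40, 42, 41, 95, 43, 60, 41, 44, 120, 42]

for depth in (0, 20, 60, 80):
    print(f"buffer {depth:3d} ms -> late packets {late_loss_rate(transit, depth):.0%}")
```

With this made-up trace, a 20 ms buffer still discards the two largest spikes, while 80 ms absorbs all of them but adds 80 ms to the conversational delay. Real jitter buffers typically adapt the depth over time rather than fixing it.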
Enhancing the HD Voice Experience
• Echo cancellation
  – Speaker and microphone placement can create echo
  – Hands free even more difficult
  – Higher audio bandwidth allows more echo
  – CPU limitations make full duplex difficult
• Noise suppression
  – Mobile environments inherently noisy
  – Higher audio bandwidth allows more noise
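To make the CPU cost of echo cancellation concrete, here is a minimal sketch of a textbook NLMS (normalized least mean squares) adaptive filter. It is a generic illustration, not the GIPS implementation: the function name, the three-tap echo path, the step size and the white-noise far-end signal are all assumptions, and the example covers far-end single talk only (no near-end speech, no double-talk handling).

```python
import random


def nlms_echo_canceller(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the far-end echo from the mic signal."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # what remains after cancellation
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual


def power(signal):
    return sum(v * v for v in signal) / len(signal)


random.seed(0)
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.6, 0.3, -0.2]  # hypothetical loudspeaker-to-microphone response
mic = [
    sum(echo_path[k] * far[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far))
]

residual = nlms_echo_canceller(far, mic)
print("echo power:    ", power(mic[-200:]))
print("residual power:", power(residual[-200:]))
```

The work per sample grows with the number of taps, and covering the same echo tail at a 16 kHz sampling rate takes twice as many taps as at 8 kHz, which is one way wideband audio raises the processing load on a constrained device.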
Sound and Noise
So Many Codec Options
G.729.1
G.719
G.718
RTAudio
SILK
iPCM-WB
Speex
iSAC
AAC-LD
G.722.2 (AMR-WB)
G.722.1 (Siren)
G.722
BV 32
SVOPC
G.711.1
EVRC-WB
MUSHRA Test Results For Different Codecs
[Figure: MUSHRA listening-test results; codec labels extracted alongside the chart: iSAC, G.722, G.711.1]
Summary
• Mobile (and conferencing) in most need of HD
  – But also have the most challenges
• Need overall high quality HD design
  – Delay, packet loss and jitter handling
  – Suitable codec designed for IP networks
  – Echo cancellation, noise suppression, gain control
  – Conference mixing
• New devices more powerful and open (Android)
  – Easier to provide high quality
  – Less need for heavy optimization of complexity