Noise and Echo Control for Immersive Voice Communication in Spacesuits

Presented as a keynote speech at the International Workshop on Acoustic Echo and Noise Control (IWAENC) in Tel Aviv, Israel, on September 2, 2010. Yiteng (Arden) Huang, WeVoice, Inc., Bridgewater, New Jersey, USA.
Page 1: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Noise and Echo Control for Immersive Voice Communication in Spacesuits

9/2/2010

Yiteng (Arden) Huang

WeVoice, Inc., Bridgewater, New Jersey, USA

[email protected]

Presented as a keynote speech at the International Workshop on Acoustic Echo and Noise Control (IWAENC) in Tel Aviv, Israel, on September 2, 2010

Page 2: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

2 | Huang: Noise and Echo Control for Immersive Voice Communication in Spacesuits All Rights Reserved © WeVoice, Inc. 2010

About the Project

Financially sponsored by the NASA SBIR (Small Business Innovation Research) program

Phase I feasibility research: Jan. 2008 – July 2008

Phase II prototype development: Jan. 2009 – Jan. 2011

Other team members:

• Jingdong Chen, WeVoice, Inc., Bridgewater, New Jersey, USA

• Scott Sands, NASA Glenn Research Center (GRC), Cleveland, Ohio, USA

• Jacob Benesty, University of Quebec, Montreal, Quebec, Canada

Page 3: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Outline

1. Problem Identification and Research Motivation

2. Problem Analysis and Technical Challenges

3. Noise Control with Microphone Arrays

4. Hardware Development

5. Software Development

6. A Portable, Real-Time Demonstration System

7. Towards Immersive Voice Communication in Spacesuits

Page 4: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Section 1: Problem Identification and Research Motivation

Page 5: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Requirements of In-Suit Audio

Speech Quality and Intelligibility:

90% word identification rate

Hearing Protection:

Limits total noise dose, hazard noise, and on-orbit continuous and impulse noise for waking and sleeping periods

Noise loads are very high during launch and orbital maneuvers.

Audio Control and Interfaces:

Provides manual silencing features and volume controls

Operation at Non-Standard Barometric Pressure Levels (BPLs):

Operates effectively between 30 kPa and 105 kPa

Page 6: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Current In-Suit Audio System

Current Solution: Communication Carrier Assembly (CCA) Audio System

[Figure: CCA audio system worn inside the helmet. Labeled parts: chin cup, microphone module, microphone boom, skullcap, perspiration absorption area, helmet, helmet ring, earpiece.]

Page 7: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Extravehicular Mobility Unit (EMU) CCA

• For shuttle and International Space Station (ISS) operations

• A large gain applied to the outbound speech for sufficient sound volume at low static pressure levels (30 kPa) leads to clipping and strong distortion during operations near sea-level BPL.

[Figure: EMU CCA. Labeled parts: interconnect wiring, nylon/spandex top, Teflon sidepiece and pocket, electret microphones, interface cable and connector, ear seal, ear cup. Source: O. Sands, NASA GRC]

Page 8: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Advanced Crew Escape Suit (ACES) CCA

• For shuttle launch and entry operations

• Hearing protection provided by the ACES CCA may not be sufficient.

[Figure: ACES CCA. Labeled parts: dynamic microphones. Source: O. Sands, NASA GRC]

Page 9: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Developmental CCA

• The active earpieces will be used in conjunction with the CCA ear cups during launch and other high noise events and can be removed for other suited operations.

• The active earpieces alone nearly provide the required level of hearing protection.

[Figure: Developmental CCA. Labeled parts: noise canceling microphones, active in-canal earpieces, ear cups. Source: O. Sands, NASA GRC]

Page 10: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

CCA Systems: Pros

• High outbound speech intelligibility and quality, SNR near optimum

Use close-talking microphones

A high degree of acoustic isolation between the in-suit noise and the suit subject’s vocalizations

A high degree of acoustic isolation between the inbound and outbound signals

The human body does NOT transmit vibration-borne noise

• Provide very good hearing protection.

Page 11: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

CCA Systems: Cons

• The microphones need to be close to the mouth of a suited subject.

• A number of recognized logistical issues and inconveniences:

Cannot adjust the cap and the microphone booms during EVA operations, which can last from 4 to 8 hours

The close-talking microphones interfere with the suited subject’s eating and drinking, and are susceptible to contamination.

The communication cap needs to fit well. Caps in a variety of different sizes need to be built and maintained, e.g., 5 sizes for EMU caps.

Wire fatigue for the microphone booms

• These problems cannot be resolved with incremental improvements to the basic design of the CCA systems.

Page 12: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Stakeholder Interviews

• The CCA ear cups produce pressure points that cause discomfort.

• The use of microphone arrays and helmet speakers is suggested.

• Suit subject comfort should be maximized as much as possible, given that other constraints can be met (relaxed and traded off):

Clear two-way voice communications

Hearing protection from the fan noise in the life support system ventilation loop

Properly containing and managing hair and sweat inside the helmet

Adequate SNR for the potential use of automatic speech recognition for the suit’s information system

Page 13: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Two Alternative Architectural Options for In-Suit Audio

1. Integrated Audio (IA): Instead of being housed in a separate subassembly, both the microphones and the speakers are integrated into the suit/helmet.

2. Hybrid Approach: Employs the inbound portion of a CCA system with the outbound portion of an IA system.

[Figure label: Helmet Speaker]

Page 14: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Section 2: Problem Analysis and Technical Challenges

Page 15: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Noise from Outside the Spacesuit

• During launch, entry descent, and landing:

Impulse noise < 140 dBSPL, Hazard noise < 105 dBA

• On orbit:

Impulse noise: < 140 dBSPL waking hours and < 83 dBSPL sleeping

Limits on continuous on-orbit noise levels by frequency:

Band Center Frequency (Hz): 63, 125, 250, 500, 1k, 2k, 4k, 8k, 16k

Sound Pressure Level (dB): 72, 65, 60, 56, 53, 51, 50, 48, 48

• Remark: During EVA operations, ambient noise is at most a minor problem.

Noise level, perception, and typical environments:

SPL 85 – 95 dB: Very High Noise (speech almost impossible to hear). Construction Site, Loud Machine Shop, Noisy Manufacturing

SPL 75 – 85 dB: High Noise (speech is difficult to hear). Assembly Line, Crowded Bus/Transit Waiting Area, Very Noisy Restaurant/Bar

SPL 65 – 75 dB: Medium Noise (must raise voice to be heard). Department Store, Band/Public Area, Supermarket

SPL 55 – 65 dB: Low Noise (speech is easy to hear). Doctor’s Office, Hospital, Hotel Lobby
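
As a rough illustration of what the continuous on-orbit limits imply overall, the nine octave-band limits can be combined into a single broadband level by summing band powers. This is a minimal sketch: the function name is hypothetical, and the unweighted power sum is an assumption, since the slide does not say how (or whether) the bands are meant to be combined.

```python
import math

# Octave-band limits from the slide: band center frequency (Hz) -> SPL limit (dB).
BAND_LIMITS_DB = {
    63: 72, 125: 65, 250: 60, 500: 56, 1000: 53,
    2000: 51, 4000: 50, 8000: 48, 16000: 48,
}

def combine_band_levels(levels_db):
    """Combine band levels (dB) into one level by adding powers (unweighted)."""
    total_power = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_power)

overall_db = combine_band_levels(BAND_LIMITS_DB.values())
print(f"Overall unweighted level at the band limits: {overall_db:.1f} dB")
```

The 63 Hz band dominates the sum, so the combined figure sits only a little above that band's own limit.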

Page 16: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Structure-Borne Noise Inside the Spacesuit

• Four noise sources (Begault & Hieronymus 2007):

1. Airflow and air inlet hissing noise, as well as fan/pump noise due to required air supply and circulation

2. Arm, leg, and hip bearing noise

3. Suit-impact noise, e.g., footfall

4. Swishing-like noise due to air movement caused by walking (since the suits are closed pressure environments)

• For CCA systems, since the suit subject’s body does not transmit bearing and impact noise, only airflow-related noise needs to be controlled.

• For Integrated Audio (IA) systems, microphones are mounted directly on the suit structure and vibration noise is loud.

Page 17: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Acoustic Challenges

• Complicated noise field:

Temporal domain: Has both stationary and non-stationary noise

Spectral domain: Inherently wideband

Spatial domain: Near field; Possibly either directional or dispersive

• Highly reverberant enclosure:

The helmet is made of highly reflective materials.

Strong reverberation dramatically reduces the intelligibility of speech uttered by the suit subject and degrades the performance of an automatic speech recognizer.

Strong reverberation leads to a more dispersive noise field, which makes beamforming less effective.
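
One common way to quantify how dispersive a noise field is uses the spatial coherence between two microphones. The sketch below assumes an ideal spherically isotropic (diffuse) field, for which the coherence between two omnidirectional microphones a distance d apart is sin(2πfd/c)/(2πfd/c). The diffuse-field model, the 2 cm spacing, and c = 343 m/s are assumptions chosen for illustration; none of them come from the slides, and conditions inside a helmet will differ.

```python
import math

def diffuse_coherence(freq_hz, spacing_m, c=343.0):
    """Coherence of an ideal diffuse noise field between two omni microphones."""
    x = 2.0 * math.pi * freq_hz * spacing_m / c
    return 1.0 if x == 0.0 else math.sin(x) / x

for f in (250, 1000, 4000, 8000):
    print(f"{f:5d} Hz: coherence = {diffuse_coherence(f, 0.02):+.3f}")
```

Where the coherence is low, the noise at the two microphones is nearly uncorrelated, so a spatial filter can only average it down rather than cancel it.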

Page 18: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Section 3: Noise Control with Microphone Arrays

Page 19: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Proposed Noise Control Scheme for IA/Hybrid Systems

[Block diagram with five numbered stages. Blocks: Microphone Array, Head Motion Tracker, Head Position Calibration, Acoustic Source Localization, Beamforming, Multichannel Noise Reduction, Adaptive Noise Cancellation, Single Channel Noise Reduction. Signal labels: Noise Reference; Mouth range and incident angle with respect to the microphone array; Outbound Speech.]

Page 20: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Current Research Focus

[Block diagram with four numbered stages. Blocks: Microphone Array, Beamforming, Multichannel Noise Reduction, Single Channel Noise Reduction. Signal label: Outbound Speech.]

Page 21: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Beamforming: Far-Field vs. Near-Field

[Figure, two panels. Each panel shows an N-element linear array (microphones 1, 2, ..., N, spacing d) with inputs X1(f), X2(f), ..., XN(f), filters h1, h2, ..., hN, and a summing node Σ.

Far-field panel: a far-field sound source of interest S(f, θ) and far-field noise V(f, ψ) arrive as plane waves; the path-length difference across the array is (N-1)·d·cos(ψ); the output is Y(f, ψ, θ).

Near-field panel: a near-field sound source S(f, rs) at position rs and far-field noise V(f, ψ) arriving as plane waves; the output is Y(f, ψ, rs).]
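
The difference between the two models can be made concrete by computing the relative propagation delays across a uniform linear array. In the far-field model the delay at microphone n relative to microphone 1 is (n-1)·d·cos(θ)/c; in the near-field model it follows from the actual source-to-microphone distances. This is a sketch under assumed values (4 microphones, 2 cm spacing, c = 343 m/s, source 10 cm from microphone 1); the function names and the geometry convention are hypothetical and not taken from the slides.

```python
import math

C = 343.0  # assumed speed of sound in m/s

def far_field_delays(n_mics, spacing, theta_deg, c=C):
    """Plane-wave delays (s) relative to microphone 1: (n-1)*d*cos(theta)/c."""
    cos_t = math.cos(math.radians(theta_deg))
    return [n * spacing * cos_t / c for n in range(n_mics)]

def near_field_delays(n_mics, spacing, theta_deg, range_m, c=C):
    """Spherical-wave delays (s) relative to microphone 1 for a source at
    distance range_m from microphone 1, at angle theta from the array axis."""
    th = math.radians(theta_deg)
    sx, sy = range_m * math.cos(th), range_m * math.sin(th)
    # Microphone n sits at x = -(n-1)*d, so microphone 1 is at the origin.
    dists = [math.hypot(sx + n * spacing, sy) for n in range(n_mics)]
    return [(dist - dists[0]) / c for dist in dists]

far = far_field_delays(4, 0.02, 60.0)
near = near_field_delays(4, 0.02, 60.0, range_m=0.10)
for n, (tf, tn) in enumerate(zip(far, near), start=1):
    print(f"mic {n}: far-field {tf * 1e6:6.1f} us, near-field {tn * 1e6:6.1f} us")
```

For a source this close, the two sets of delays differ noticeably, which is why a far-field steering model would be mismatched for a talker's mouth inside a helmet.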

Page 22: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Fixed Beamformer vs. Adaptive Beamformer

Microphone Array Beamformers

• Fixed Beamformers: Delay-and-Sum, Filter-and-Sum

• Adaptive Beamformers: MVDR (Capon), LCMV (Frost)/GSC

Noise Field?

• Fixed: Stationary, known before the design; isotropic noise generally assumed

• Adaptive: Time varying, unknown

Reverberation?

• Fixed: Not concerned

• Adaptive: Significant

Delay-and-Sum

• Simple

• Non-uniform directional responses over a wide spectrum of frequencies

Filter-and-Sum

• Complicated

• Uniform directional responses over a wide spectrum of frequencies: good for wideband signals, like speech

MVDR (Capon)

• Only the TDOAs of the speech source of interest need to be known – simple requirements.

• Reverberation causes the signal cancellation problem.

• Time-domain or frequency-domain

LCMV (Frost)/GSC

• The impulse responses (IRs) from the source to the microphones have to be known or estimated.

• Errors in the IRs lead to the signal cancellation problem.
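
The remark that a delay-and-sum beamformer has non-uniform directional responses across frequency can be checked numerically. The sketch below evaluates the far-field response magnitude of a uniform linear array steered to broadside for a plane wave arriving 60 degrees off the look direction. The array size (4 microphones), spacing (4 cm), speed of sound, and function name are assumptions for illustration only.

```python
import cmath
import math

def das_response(freq_hz, theta_deg, steer_deg, n_mics=4, spacing=0.04, c=343.0):
    """Magnitude response of a delay-and-sum beamformer (uniform linear array)
    to a far-field plane wave from theta_deg when steered to steer_deg."""
    delta = math.cos(math.radians(theta_deg)) - math.cos(math.radians(steer_deg))
    phase_step = 2.0 * math.pi * freq_hz * spacing * delta / c
    total = sum(cmath.exp(1j * n * phase_step) for n in range(n_mics))
    return abs(total) / n_mics

for f in (300, 1000, 3000):
    gain_db = 20.0 * math.log10(das_response(f, theta_deg=30.0, steer_deg=90.0))
    print(f"{f:5d} Hz: response 60 degrees off the look direction = {gain_db:6.1f} dB")
```

With these values the off-axis attenuation is small at low frequencies and grows with frequency, so a wideband signal such as speech is filtered unevenly; this is the motivation for the filter-and-sum design listed above.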

Page 23: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Comments on Traditional Microphone Array Beamforming

• For incoherent noise sources, the gain in SNR is low if the number of microphones is small.

• For coherent noise sources whose directions are different from that of the speech source, a theoretically optimal gain in SNR can be high but is difficult to obtain due to a number of practical limitations:

Unavailability of precise a priori knowledge of the acoustic impulse responses from the speech sources to the microphones.

Inconsistent responses of the microphones across the array.

• For coherent noise sources that are in the same direction as the speech source, beamforming (as a spatial filter) is ineffective.
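
For the first point, a standard result is that averaging N time-aligned microphone signals whose noise components are mutually uncorrelated and of equal power improves the SNR by a factor of N, i.e., 10·log10(N) dB. The sketch below simply tabulates that figure; the equal-power, fully incoherent noise model is an assumption, and the function name is hypothetical.

```python
import math

def incoherent_snr_gain_db(n_mics):
    """SNR gain (dB) of delay-and-sum over one microphone for spatially
    uncorrelated, equal-power noise: the gain factor is N."""
    return 10.0 * math.log10(n_mics)

for n in (2, 4, 8, 16):
    print(f"N = {n:2d}: SNR gain = {incoherent_snr_gain_db(n):4.1f} dB")
```

Doubling the number of microphones adds only about 3 dB under this model.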

Page 24: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Multichannel Noise Reduction (MCNR)

• Signal Model:

[Figure: a speech source of interest s(k) reaches microphones 1, 2, ..., N through impulse responses g1, g2, ..., gN; with noise v(k), the microphone outputs are x1(k), x2(k), ..., xN(k).]

• A conceptual comparison of beamforming and MCNR:

Beamforming

• Beamformer: Spatial Filtering

• Array Setup: Calibration is necessary – possibly time/effort consuming

• Requires knowledge related to the source position or gn

• Output: s(k) (dereverberation and denoising)

MCNR

• MCNR: Statistical Filtering

• Array Setup: No need to strictly demand a specific array geometry/pattern

• Output: x1,s(k) (only denoising)

Page 25: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Frequency-Domain MVDR Filter for MCNR

• The problem formulation:

• The MVDR filter:

• A more practical implementation:

where

• Similar to traditional single-channel noise reduction methods, the noise PSD matrix is estimated during silent periods and the signal PSD matrix is estimated during speech periods.
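
The equations on this slide are not reproduced in the transcript. As a hedged illustration, the sketch below uses one commonly published form of the frequency-domain MVDR filter for noise reduction, h = (Φvv⁻¹Φyy − I)u / (tr[Φvv⁻¹Φyy] − N) with u = [1, 0, ..., 0]ᵀ, which needs only the noise and noisy-signal PSD matrices. Whether this matches the slide's "more practical implementation" exactly is an assumption, as are all symbol names and numeric values. For a two-microphone toy case at a single frequency bin, it checks that this form equals the classical Φvv⁻¹d / (dᴴΦvv⁻¹d), where d is the channel vector normalized to microphone 1.

```python
def mat_mul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_inv(m):
    """Inverse of a 2x2 complex matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# Toy quantities at a single frequency bin (all values are made up).
d = [1.0 + 0.0j, 0.6 - 0.3j]        # channel vector relative to microphone 1
phi_x1 = 2.0                         # speech PSD at microphone 1
phi_vv = [[1.0, 0.2 + 0.1j],         # noise PSD matrix (Hermitian, positive definite)
          [0.2 - 0.1j, 0.8]]
# Noisy-signal PSD matrix: phi_yy = phi_x1 * d d^H + phi_vv.
phi_yy = [[phi_x1 * d[i] * d[j].conjugate() + phi_vv[i][j] for j in range(2)]
          for i in range(2)]

# Form that uses only the two PSD matrices (no channel knowledge).
a = mat_mul(mat_inv(phi_vv), phi_yy)
denom = a[0][0] + a[1][1] - 2.0                       # tr[A] - N with N = 2
h_stat = [(a[0][0] - 1.0) / denom, a[1][0] / denom]   # first column of (A - I)

# Classical MVDR that requires the channel vector d.
inv_v = mat_inv(phi_vv)
w = [inv_v[i][0] * d[0] + inv_v[i][1] * d[1] for i in range(2)]
norm = d[0].conjugate() * w[0] + d[1].conjugate() * w[1]
h_chan = [w[0] / norm, w[1] / norm]

# Distortionless check: h^H d should equal 1.
gain = sum(h.conjugate() * dn for h, dn in zip(h_stat, d))
print(h_stat, h_chan, gain)
```

In practice the two PSD matrices would be estimated as the last bullet describes: the noise PSD matrix during silent periods and the signal PSD matrix during speech periods.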

Page 26: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Comparison of the MVDR Filters for Beamforming and MCNR

• MVDR for Beamforming (BF):

• The acoustic impulse responses can at best be estimated up to a scale:

where denotes the true response vector.

Leads to speech distortion.

• MVDR for MCNR:

• Note: In the implementation of the MVDR-MCNR, the channel responses do not need to be known.

Page 27: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Distortionless Multichannel Wiener Filter for MCNR

• Use what we call spatial prediction:

• Formulate the following optimization problem:

where

• The distortionless multichannel Wiener (DW) filter for MCNR:

• The optimal Wiener solution for the non-causal spatial prediction filters:

where So,

• It was found that

Page 28: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Single-Channel Noise Reduction (SCNR) for Post-Filtering

• Beamforming: The Wiener filter (the optimal solution in the MMSE sense) can be factorized as

[Equation with two labeled factors: MVDR Beamformer; Wiener Filter for SCNR]

Note: For a complete and detailed development of this factorization, please refer to Eq. (3.19) of the following book. M. Brandstein and D. Ward, eds., Microphone Arrays: Signal Processing Techniques and Applications, Berlin, Germany: Springer, 2001.

• MCNR: Again, the Wiener filter can be factorized as

[Equation with two labeled factors: MVDR for MCNR; Wiener Filter for SCNR]

Note: For a complete and detailed development of this factorization, please refer to Eq. (6.117) of the following book. J. Benesty, J. Chen, and Y. Huang, Microphone Array Signal Processing, Berlin, Germany: Springer, 2008.
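
The factorized expressions themselves are missing from the transcript. In one common notation (an assumption here, not necessarily the slide's: d(f) is the source-to-array steering vector, Φvv(f) the noise PSD matrix, and φs(f) the speech PSD), the factorization of the multichannel Wiener filter for the beamforming case is usually written as:

```latex
\mathbf{h}_{\mathrm{W}}(f)
  = \underbrace{\frac{\boldsymbol{\Phi}_{vv}^{-1}(f)\,\mathbf{d}(f)}
                     {\mathbf{d}^{H}(f)\,\boldsymbol{\Phi}_{vv}^{-1}(f)\,\mathbf{d}(f)}}_{\text{MVDR beamformer}}
    \cdot
    \underbrace{\frac{\phi_{s}(f)}
                     {\phi_{s}(f) + \left[\mathbf{d}^{H}(f)\,\boldsymbol{\Phi}_{vv}^{-1}(f)\,\mathbf{d}(f)\right]^{-1}}}_{\text{single-channel Wiener post-filter}}
```

The second factor is a single-channel Wiener gain in which the bracketed term is the residual noise PSD at the MVDR output, which is why SCNR appears as a post-filter in the processing chain.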

Page 29: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Single-Channel Noise Reduction (SCNR)

• The signal model:

• SCNR filter:

• Error signal:

• MSE cost function:

• The Wiener filter:

where

and

• Other SCNR methods: Parametric Wiener filter, Tradeoff filter.

• A well-known feature: Noise reduction is achieved at the cost of adding speech distortion.
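
The formulas on this slide are missing from the transcript, but the trade-off in the bullet above can be illustrated with the textbook frequency-domain Wiener gain H = φx / (φx + φv) = SNR / (1 + SNR), where φx and φv are the speech and noise PSDs in a given frequency bin. The symbols and this simple per-bin model are assumptions. Because the gain is applied to the noisy observation as a whole, the noise and the speech component in a bin are scaled by the same factor:

```python
import math

def wiener_gain(snr_db):
    """Per-bin Wiener gain H = SNR / (1 + SNR), with the input SNR given in dB."""
    snr = 10.0 ** (snr_db / 10.0)
    return snr / (1.0 + snr)

for snr_db in (-10, 0, 10, 20):
    h = wiener_gain(snr_db)
    # The same real gain scales both components, so this attenuation is
    # both the noise reduction and the speech attenuation in that bin.
    print(f"input SNR {snr_db:+3d} dB: gain = {h:.3f} "
          f"({-20.0 * math.log10(h):.1f} dB attenuation)")
```

Bins with low SNR are attenuated most, and those are the bins where the speech component is distorted most.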

Page 30: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

New Idea for SCNR

• A second-order complex circular random variable (CCRV) has:

which implies that and its conjugate are uncorrelated.

• In general, speech is not a second-order CCRV:

• But noise is a second-order CCRV if stationary, and not otherwise.

• Examine

[Equation labels: "Correlated but not completely coherent" and "Uncorrelated or coherent"]

• This is similar to the signal model of a two-element microphone array. So there is a chance to reduce noise without adding any speech distortion.
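
The expressions on this slide are missing from the transcript. As an illustration of the idea, a zero-mean complex random variable Z is second-order circular when E[Z²] = 0 (the symbol Z is an assumption). The sketch below estimates the normalized quantity |E[Z²]| / E[|Z|²] from samples of a synthetic circular variable and a synthetic noncircular one. Both test signals are hypothetical stand-ins and are neither speech nor suit noise.

```python
import random

def noncircularity(samples):
    """Estimate |E[Z^2]| / E[|Z|^2]: near 0 for second-order circular data,
    up to 1 for strongly noncircular data (zero mean assumed)."""
    pseudo_var = sum(z * z for z in samples) / len(samples)
    var = sum(abs(z) ** 2 for z in samples) / len(samples)
    return abs(pseudo_var) / var

rng = random.Random(0)
n = 20000
# Circular: independent real and imaginary parts with equal variance.
circular = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
# Noncircular: the imaginary part has much less power than the real part.
noncircular = [complex(rng.gauss(0, 1), rng.gauss(0, 0.3)) for _ in range(n)]

print(f"circular:    {noncircularity(circular):.3f}")
print(f"noncircular: {noncircularity(noncircular):.3f}")
```

When this quantity is clearly nonzero for the desired signal, the observation and its conjugate carry partly redundant information about it, which is the property the two-element-array analogy relies on.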

Page 31: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Widely Linear Wiener Filter

• New filter for SCNR:

• Error signal:

• Widely linear MSE:

• Then the widely linear Wiener filter or MVDR type of filters can be developed.
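
The filter expressions are missing from the transcript. In a widely linear framework, the estimate is formed from both the noisy observation and its complex conjugate. A sketch in assumed notation (Y for the noisy short-time spectral coefficient, X for the clean speech coefficient, H and H' for the two filter gains, and a calligraphic E for the error; none of these symbols are confirmed by the slides):

```latex
\begin{aligned}
\hat{X} &= H^{*}\,Y + H'^{*}\,Y^{*}
         = \tilde{\mathbf{h}}^{H}\,\tilde{\mathbf{y}},
\qquad
\tilde{\mathbf{h}} = \begin{bmatrix} H \\ H' \end{bmatrix},\quad
\tilde{\mathbf{y}} = \begin{bmatrix} Y \\ Y^{*} \end{bmatrix},\\[4pt]
\mathcal{E} &= \hat{X} - X,\\[4pt]
J(\tilde{\mathbf{h}}) &= \mathrm{E}\!\left[\,|\mathcal{E}|^{2}\,\right],
\qquad
\tilde{\mathbf{h}}_{\mathrm{W}}
  = \boldsymbol{\Phi}_{\tilde{y}\tilde{y}}^{-1}\,\mathrm{E}\!\left[\tilde{\mathbf{y}}\,X^{*}\right],
\quad
\boldsymbol{\Phi}_{\tilde{y}\tilde{y}} = \mathrm{E}\!\left[\tilde{\mathbf{y}}\,\tilde{\mathbf{y}}^{H}\right].
\end{aligned}
```

Because the stacked vector contains Y and its conjugate, the problem has the same structure as a two-element array, which is what makes MVDR-type (distortionless) solutions possible in the single-channel case.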

Page 32: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Section 4: Hardware Development

Page 33: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

33 | Huang: Noise and Echo Control for Immersive Voice Communication in Spacesuits All Rights Reserved © WeVoice, Inc. 2010

Computational Platform/Technology Selection

Three platforms under consideration:

• ASIC

• DSP

• FPGA

Trade-off among performance, power consumption, size, and costs

Four competing factors:

• The count of transistors employed

• The number of clock cycles required

• The time taken to develop an application

• Nonrecurring engineering (NRE) costs

ASIC

• Low numbers of transistors and clock cycles

• Long development time and high NRE costs

• Effective in performance, power, and size, but not in cost

DSP

• Low development and NRE costs

• Low power consumption

• More effort needed to convert the design to ASICs

FPGA

• Not suited to processing sequential conditional data flow, but efficient in concurrent applications

• Supports faster I/O than DSPs

• One step closer to ASIC than DSP

• High development cost due to performance optimization

Page 34: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

System Block Diagram

[Figure: system block diagram. Eight microphone capsules, each with its own microphone powering circuit (numbered 1 to 8), connect through male/female XLR connectors (pin 1 GND, pin 2 HOT, pin 3 COLD) and male/female DB25 connectors to the analog input of the FPGA board. On the board: microphone preamps (gain G, set by jumpers), an 8-channel 24-bit 48 kHz ADC, an Altera FPGA with SDRAM, Flash, and a JTAG (male) header, a power management IC with a power jack, and a USB 2.0 digital output interface.]

Page 35: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

FPGA Board Block Diagram

[Figure: FPGA board block diagram. Eight OPA1632 stages, OPA1632 (1) to OPA1632 (8), and an ADS1278 ADC on the analog side; an Altera Cyclone III EP3C55F484C8 FPGA on the digital side. Also shown: EPCS16, 16 MB SDRAM (×32), 16 MB Flash (×16), a 50 MHz XTAL, a 24.576 MHz XTAL, USB 2.0 (High Speed), user LED/IOs, and 3.3 V.]
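
As a rough sanity check on the block diagrams, the arithmetic below estimates the raw data rate of the 8-channel, 24-bit, 48 kHz ADC and compares it with the nominal 480 Mbit/s signaling rate of USB 2.0 High Speed. It also checks that the 24.576 MHz crystal frequency is an integer multiple of the 48 kHz sampling rate. The function and constant names are illustrative, protocol overhead is ignored, and whether this crystal actually clocks the ADC is an assumption rather than something the slides state.

```python
# Figures taken from the block diagrams; names are illustrative only.
NUM_CHANNELS = 8
BITS_PER_SAMPLE = 24
SAMPLE_RATE_HZ = 48_000
USB2_HIGH_SPEED_BPS = 480_000_000  # nominal signaling rate, ignores overhead
AUDIO_XTAL_HZ = 24_576_000         # assumed (not confirmed) to be the audio clock


def raw_adc_bit_rate(channels: int, bits: int, rate_hz: int) -> int:
    """Raw payload bit rate of the ADC, without framing or protocol overhead."""
    return channels * bits * rate_hz


bit_rate = raw_adc_bit_rate(NUM_CHANNELS, BITS_PER_SAMPLE, SAMPLE_RATE_HZ)
usb_share = bit_rate / USB2_HIGH_SPEED_BPS
clock_ratio, clock_remainder = divmod(AUDIO_XTAL_HZ, SAMPLE_RATE_HZ)

print(f"raw ADC rate: {bit_rate / 1e6:.3f} Mbit/s")
print(f"share of nominal USB 2.0 High Speed rate: {usb_share:.2%}")
print(f"24.576 MHz / 48 kHz = {clock_ratio} (remainder {clock_remainder})")
```

Under these assumptions the eight raw channels occupy only a small share of the link, which is consistent with streaming all of them to the host over a single USB 2.0 connection.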

Page 36: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Prototype FPGA Board: the Top View

[Photo: top view of the prototype FPGA board, 174.8 mm × 101 mm. Labeled parts: phantom power feeding, mic. preamp gain jumpers, OPA1632, REF1004, ADS1278, EPCS16, user LEDs, user I/Os, JTAG, FT2232H, USB 2.0 jack, 12 MHz crystal, GND, TPS65053, Flash, SDRAMs, Cyclone III FPGA, DC power jack, power LED, analog power DC 9 V, analog power DC 5 V, DB25.]

Page 37: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Prototype FPGA Board: the Bottom View

[Photo: bottom view of the prototype FPGA board. Labeled parts: OPA1632, 50 MHz clock oscillator (OSC2), 24.576 MHz clock oscillator (OSC1).]

Page 38: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

FPGA System Development Flow Adopted in the Project

System on Programmable Chip (SoPC) + C/C++ Programming:

1) Use SoPC Builder to construct a soft-core NIOS II processor embedded on the Altera FPGA

2) Develop software/DSP systems in C/C++ on the NIOS II processor

• Advantages:

Short development cycle/time

Low cost

High reliability

Reusability of intellectual property

• Drawbacks:

Poor efficiency and low performance:

Efficiency can be improved by identifying the time-consuming functions (e.g., FFT and IFFT) and accelerating them in hardware with the C2H (C-to-Hardware) tool.
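
To see why accelerating only the FFT and IFFT can pay off, Amdahl's law gives a quick estimate of the overall speedup. The sketch below is illustrative: the function name is hypothetical, and the time fraction and acceleration factor in the example are made-up placeholders, not measurements from this project.

```python
def overall_speedup(accelerated_fraction: float, acceleration: float) -> float:
    """Amdahl's law: speedup of the whole frame when only part of it gets faster.

    accelerated_fraction: share of the original run time spent in the
        functions moved to hardware (hypothetical value, 0..1).
    acceleration: how many times faster those functions become (> 0).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if acceleration <= 0.0:
        raise ValueError("acceleration must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / acceleration)


# Hypothetical example: FFT/IFFT take 60% of the frame time and run 10x faster.
print(round(overall_speedup(0.6, 10.0), 3))
```

The estimate shows that the gain is bounded by the part of the code that stays in software, so profiling to find the dominant functions comes before any hardware acceleration.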

[Figure: SoPC system blocks: CPU (NIOS II), ROM, RAM, I/O, UART, DSP.]

Page 39: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

MEMS Microphone Array

[Figure: XG-MPC-MEMS microphone array board with 7 subarrays, numbered 1 to 7, each with four microphones labeled a, b, c, and d. Dimension labels: 5 mm and 20 mm. Microphones: Analog Devices ADMP402 MEMS microphones, 2.5 mm × 3.35 mm. Connector: Samsung 18-pin connector, pins 1 to 18.]
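
The spacings in the figure can be related to the usual half-wavelength rule for spatial sampling, under which a pair of microphones a distance d apart avoids spatial aliasing up to roughly c / (2d). The sketch below applies that rule to the two dimensions shown. It assumes that 5 mm is the spacing inside a subarray and 20 mm is the spacing between neighboring subarrays, which is a reading of the figure and not something the slide states. It also takes the speed of sound as 343 m/s, although the gas inside a suit may differ from room air.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at about 20 degrees Celsius


def aliasing_limit_hz(spacing_m: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Highest frequency with spacing <= half a wavelength: f = c / (2 d)."""
    if spacing_m <= 0.0:
        raise ValueError("spacing must be positive")
    return c / (2.0 * spacing_m)


# Spacings read off the figure; their interpretation is an assumption.
for label, spacing_mm in (("within subarray", 5.0), ("between subarrays", 20.0)):
    limit_khz = aliasing_limit_hz(spacing_mm / 1000.0) / 1000.0
    print(f"{label}: {spacing_mm} mm -> about {limit_khz:.1f} kHz")
```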

Page 40: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

MEMS Microphone Array Box

[Figure: WeVoice MEMS microphone array box with a Samsung 18-pin connector (Pin 1 to Pin 18). Visible subarray labels: 7, 6, 5, 4, 3, 2. Dimension labels: 135 mm, 12.5 mm, 155 mm.]

Page 41: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Section 5

1. Problem Identification and Research Motivation

2. Problem Analysis and Technical Challenges

3. Noise Control with Microphone Arrays

4. Hardware Development

5. Software Development

6. A Portable, Real-Time Demonstration System

7. Towards Immersive Voice Communication in Spacesuits

Page 42: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

FPGA Program Flowchart

[Figure: FPGA program flowchart. Blocks per time frame: data in & preprocessing, 4-ch FFT, MCNR+SCNR, 1-ch IFFT, overlap add, USB trans. The work is shared between the Nios II soft core and an FFT/IFFT processor inside the FPGA, with data coming from the ADC and going to USB. Time axis in ms: t, t+4, t+8, with one time frame marked.]

Processing delay < 8 ms
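
The frame timing in the flowchart (frames at t, t+4, t+8 ms) can be sketched as a block-processing loop with overlap-add. The code below is a simplified single-channel illustration. It assumes a 48 kHz sampling rate (from the ADC), a 4 ms frame advance, 50% overlap, and a periodic Hann window. The window choice, the frame length, and the `process_frame` placeholder are assumptions, and the FFT, MCNR+SCNR, and IFFT stages are replaced by an identity so that the reconstruction property is easy to check.

```python
import math

FS_HZ = 48_000                 # ADC sampling rate from the system block diagram
HOP_MS = 4                     # frame advance read off the flowchart time axis
HOP = FS_HZ * HOP_MS // 1000   # samples per frame advance
FRAME = 2 * HOP                # assumed 50% overlap between consecutive frames

# Periodic Hann window: copies shifted by HOP sum exactly to 1.
WINDOW = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / FRAME) for n in range(FRAME)]


def process_frame(frame):
    """Placeholder for FFT -> MCNR+SCNR -> IFFT; here it is the identity."""
    return frame


def overlap_add(signal):
    """Window each frame, process it, and sum the overlapping outputs."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - FRAME + 1, HOP):
        frame = [signal[start + n] * WINDOW[n] for n in range(FRAME)]
        frame = process_frame(frame)
        for n in range(FRAME):
            out[start + n] += frame[n]
    return out


# With the identity placeholder, fully overlapped samples are reconstructed.
demo = [math.sin(0.01 * i) for i in range(5 * HOP)]
result = overlap_add(demo)
max_err = max(abs(result[i] - demo[i]) for i in range(HOP, 4 * HOP))
print(f"hop = {HOP} samples, frame = {FRAME} samples, max error = {max_err:.2e}")
```

Only the samples covered by two frames are reconstructed exactly, so the first and last half-frames of the demo signal are excluded from the error check.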

Page 43: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

IA System Windows Host Software

• Programmed with Microsoft Visual C++

• DirectSound is used to play back audio (speech).

[Figure: splash window of the program]

Page 44: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

IA System Windows Host GUI: Multitrack View

Page 45: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

IA System Windows Host GUI: Single-Track View

Page 46: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

IA System Windows Host GUI: Playing Back

Page 47: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Section 6

1. Problem Identification and Research Motivation

2. Problem Analysis and Technical Challenges

3. Noise Control with Microphone Arrays

4. Hardware Development

5. Software Development

6. A Portable, Real-Time Demonstration System

7. Towards Immersive Voice Communication in Spacesuits

Page 48: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

The Portable, Real-Time Demo System

[Photo: the demo system. Labeled parts: FPGA board, power supply (linear DC 12-20 V/1 A), suited subject, DB25 connectors, PC, USB 2.0 cable, MEMS microphone array, audio cable.]

Page 49: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Section 7

1. Problem Identification and Research Motivation

2. Problem Analysis and Technical Challenges

3. Noise Control with Microphone Arrays

4. Hardware Development

5. Software Development

6. A Portable, Real-Time Demonstration System

7. Towards Immersive Voice Communication in Spacesuits

Page 50: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

What Is Immersive Communication and Why Do We Want It?

Telecommunication helps people collaborate and share information by overcoming three separations/constraints:

Long distance

Real time

Physical boundaries

Modern telecommunication technologies have so far succeeded in transcending the first two constraints, i.e., the long-distance and real-time constraints.

Immersive communication offers a feeling of being together and sharing a common environment during collaboration.

Immersive communication aims to break the physical boundaries, which is the “last mile” problem in communication.

Page 51: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

What needs to be solved for immersive communication systems?

[Figure: single-channel acoustic echo cancellation]

Page 52: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

What needs to be solved for immersive communication systems?

[Figure: multichannel acoustic echo cancellation]

Page 53: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

What needs to be solved for immersive communication systems?

[Figure: synthesized stereo audio mixing system]

Page 54: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

What needs to be solved for immersive communication systems?

[Figures: beamforming; blind source separation]

Page 55: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

What needs to be solved for immersive communication systems?

[Figure: acoustic source localization and tracking]

Page 56: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

What needs to be solved for immersive communication systems?

[Figure: stereophony system for spatial sound reproduction]

Page 57: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

What needs to be solved for immersive communication systems?

[Figure: wave field synthesis]

Page 58: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Why Immersive Voice Communication in Spacesuits?

Immersive voice communication exploits human binaural hearing.

Provides enhanced situational awareness for a suited crewmember:

Can improve the productivity of collaboration among the crewmembers

Can produce potential safety benefits

Crew comfort can be optimized.

Page 59: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

What Problems Need to be Solved?

• Stereo/Multichannel Acoustic Echo Cancellation (MCAEC)

• Integration of MCAEC and MCNR

• Three-Dimensional (3D) Audio

Page 60: Noise and Echo Control for Immersive Voice Communication  in Spacesuits

Conclusions

• More than 40 years after Neil Armstrong landed on the Moon, astronauts still use the communication carrier assembly (CCA) based audio system for voice communication in spacesuits.

• The new spacesuit design will take advantage of the most recent advances in multichannel acoustic and speech signal processing for echo and noise control, while significantly improving crew comfort and ease of use.

Noise reduction with microphone arrays

Multichannel echo cancellation

Integrated echo and noise control

3D audio

• We explained the difference between the traditional beamforming method and what we called the multichannel noise reduction approach.

• We presented an intuitive interpretation of the widely linear Wiener filter for single-channel noise reduction.

• We described a new application of immersive communication in space exploration, ancillary to its mainstream use in commercial telecommunication systems.
