
State of art and Challenges in Improving Speech Intelligibility in Hearing Impaired People

Stefan Launer, Lyon, January 2011

Phonak AG, Stäfa, CH

©Phonak Stefan Launer, Speech in Noise Workshop, January 2011 Page 2

Content

• Speech intelligibility in complex listening environments for hearing impaired persons
• Noise reduction technologies in hearing instruments
• De-reverberation
• Single microphone technology
• Multi-microphone technology
• FM systems
• Results
• Challenges


Speech Intelligibility in Noise??? → Speech Intelligibility in Complex Listening Conditions!!
• Different types of interfering sources
• Different spatial arrangements of sources and interferers
• Dynamic…
• Room acoustics
• Reverberation
• Distance


Speech Intelligibility in Noise??? → Speech Intelligibility in Complex Listening Conditions!!
• Test methodology
• Speech tests: short sentences, words, phonemes - target from front, static
• White noise… from the back
• Anechoic environment
• Lab / real life results
• Speech intelligibility
• Listening effort


Speech Intelligibility in Noise
[Figure: SNR (dB, axis 0 to 20) versus hearing loss (dB, 3FA, axis 20 to 90), with regions for mild, moderate and severe hearing loss. Killion 1997]


Physical structure of interfering signal has a strong impact on speech intelligibility
Introducing ...
• spectral dips: SH 3 - 4 dB SRT, NH 9 - 15 dB
• temporal dips: SH 1 - 2 dB SRT, NH 6 - 7 dB
• combination of both: SH 4 - 5 dB SRT, NH 15 - 20 dB
• ... improves speech intelligibility a lot for normal-hearing subjects, much less so for hearing-impaired subjects!
Peters, Moore and Baer 1998, JASA


Speech Intelligibility in Multi-talker Environment
• Speech intelligibility as a function of interfering talkers
[Figure: Fig. 2, Bronkhorst and Plomp, JASA 1992]


Spatial Release from Masking – Anechoic Chamber

[Figure: panels for NH and HI listeners; annotation: 10 dB! Beutelmann & Brand, JASA 2006]


Spatial Release from Masking - Office
• Spatial release reduced by reverberation
[Figure: panels for NH and HI listeners; annotation: 4 dB! Beutelmann & Brand, JASA 2006]


Spatial Release from Masking - Cafeteria

[Figure: panels for NH and HI listeners; annotation: 6-7 dB! Beutelmann & Brand, JASA 2006]


Speech intelligibility in reverberant environments
[Figure: correct % (axis 0 to 100) versus reverberation time (sound suite, T = 0.54, T = 1.55) for normal hearing, mild and moderate / severe hearing loss. Harris & Swenson, Audiology 1990, p. 314-321]


How to mix a Speech in Noise Cocktail
[Diagram: speech in noise cocktail, with ingredients: directional microphones, noise canceling]
Objectives for a hearing instrument:
• Speech intelligibility improvement!!!!
• Ease of listening, listening effort, listening comfort


Noise Reduction Using a Single Microphone
• Single-microphone noise cancellers: in principle, estimate the noise and subtract it from the noisy signal.
[Diagram: the noisy input (S + N) feeds speech detection and noise estimation, which yields the noise estimate N*; adaptive filter: H = 1 - N* / (S + N); output: (S + N) - N* ≈ S]
• Statistical estimation, amplitude modulation, noise detection in speech pauses
• Use a single information source to separate two signals
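The gain rule H = 1 - N* / (S + N) from this slide can be sketched per frequency band as follows. This is only an illustration, not the product algorithm: the gain floor, the smoothing constant and the boolean speech flag are assumptions introduced here, and S + N and N* are taken to be band powers.

```python
import numpy as np

def suppression_gain(noisy_power, noise_estimate, floor=0.1):
    """Per-band gain H = 1 - N*/(S+N), limited to [floor, 1].

    `floor` is a hypothetical lower limit that avoids negative gains
    when the noise estimate exceeds the current band power.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_estimate = np.asarray(noise_estimate, dtype=float)
    gain = 1.0 - noise_estimate / np.maximum(noisy_power, 1e-12)
    return np.clip(gain, floor, 1.0)

def update_noise_estimate(noise_estimate, noisy_power, speech_present, smoothing=0.9):
    """Update N* only where no speech is detected (speech pauses)."""
    updated = smoothing * noise_estimate + (1.0 - smoothing) * noisy_power
    return np.where(speech_present, noise_estimate, updated)
```

Bands dominated by noise (N* close to S + N) are attenuated down to the floor, while bands with a high SNR pass almost unchanged.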


Reverberation Canceller – Reduces the smearing effects by de-blurring the speech signal
[Diagram: EchoBlock; signal level over time, showing the time span of early reflections and the time span of disturbing reflections]


Single Microphone Noise Reduction - Summary
• This technique performs well at eliminating stationary noises, such as a fan or the noise in a car.
• Reverberation: very reverberant rooms
• Speech-like noises can’t be suppressed without degrading speech quality at the same time.
• ... ease of listening: improving listening comfort
• reduction of perceived noisiness
• less annoyance
• Improvement of speech intelligibility ???
• Sound quality is a trade-off…


Delay & Sum - Technique
• The acoustical signal is picked up at two different locations by the front and the back microphones
• The signal from the back is delayed
• The signals from both microphones are summed
• Depending on delay - different directions are attenuated
[Diagram: front microphone f and back microphone b at distance d along the target direction; the back signal is delayed by d/c and added (+) to the front signal]
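The two-microphone scheme on this slide can be sketched as a directional response. Several things here are assumptions rather than statements from the slide: the back signal is added with inverted polarity (`sign=-1`), as is common in first-order differential designs; the microphone spacing is 12 mm; the speed of sound is 343 m/s; and the sound field is a free-field plane wave.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def two_mic_response(theta_deg, freq_hz, d=0.012, delay=None, sign=-1.0):
    """Magnitude of front + sign * delayed back for a plane wave from theta.

    theta = 0 is the target direction (front); `d` is a hypothetical
    microphone spacing in metres; `delay` defaults to d / c.
    """
    if delay is None:
        delay = d / C
    theta = np.deg2rad(theta_deg)
    omega = 2.0 * np.pi * freq_hz
    # A plane wave from theta reaches the back microphone d*cos(theta)/c later.
    acoustic_delay = d * np.cos(theta) / C
    back = np.exp(-1j * omega * (acoustic_delay + delay))
    return np.abs(1.0 + sign * back)
```

With delay = d/c, sound from the back (180°) cancels, while sound from the front passes. The front response also falls towards low frequencies, so it is smaller at 250 Hz than at 1 kHz. Other delay values move the attenuated direction.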


Digital Adaptive Directional Microphones
• Adaptive: minimize output energy of the two microphones
[Diagram: front and back microphones, each followed by an AD converter, feed a spatial processor with weighting factor α]
The spatial weighting factor (α) is continuously adapted; the Directivity Index is thereby optimized.
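One common way to realize such an adaptation is to form a forward-facing and a backward-facing cardioid and subtract α times the latter from the former, choosing α to minimize the output energy. The slide does not name the scheme it uses, so the sketch below is an assumption. It works on complex narrowband signals at a single frequency, with a hypothetical 12 mm microphone spacing and a closed-form least-squares α in place of a continuous update.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def mic_signals(theta_deg, freq_hz, d, n=256, seed=0):
    """Complex narrowband front/back signals for one source at theta."""
    rng = np.random.default_rng(seed)
    amp = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    lag = 2.0 * np.pi * freq_hz * d * np.cos(np.deg2rad(theta_deg)) / C
    return amp, amp * np.exp(-1j * lag)  # front, back

def adaptive_output(front, back, freq_hz, d):
    """Return (output, alpha) with alpha minimizing the output energy."""
    shift = np.exp(-1j * 2.0 * np.pi * freq_hz * d / C)  # delay of d/c
    c_front = front - shift * back   # cardioid with its null to the rear
    c_back = back - shift * front    # cardioid with its null to the front
    num = np.real(np.vdot(c_back, c_front))
    den = np.real(np.vdot(c_back, c_back)) + 1e-12
    alpha = float(np.clip(num / den, 0.0, 1.0))  # keeps the null in the rear half
    return c_front - alpha * c_back, alpha
```

For a single interferer in the rear hemisphere, for example `mic_signals(120, 1000, 0.012)`, the resulting α places the null on that source and the output energy drops to almost zero.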


Digital Adaptive Directional Microphones
• Amplifies sounds from the front
• Adaptively attenuates the strongest noise source


Frequency Specific Beamforming
Directivity in each frequency band


Directional Microphones: Potential and Limitations
• Significant speech intelligibility benefit compared to omnidirectional systems in complex listening conditions
• from side & asymmetric
• diffuse
• moving noises
• reverberant environments & larger distances
• Lab results: 3-6 dB improvements


Directional Microphones: Potential and Limitations
• Positioning on head
• Microphone mismatch, ageing etc.
• More than two microphones
• Noise floor
• Narrow beam pattern acceptable?
• Size constraint: low frequency roll-off
• Computational complexity
[Diagram: front microphone f and back microphone b at distance d along the target direction; delay = d/c; summation (+)]


Directivity Index for Different Product Styles and Placements
• BTE
[Figure: directivity index DI (axis 0 to 5)]
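As background for the DI values, the directivity index compares the on-axis power response with the power response averaged over all directions. The sketch below uses idealized free-field first-order patterns B(θ) = a + (1 - a)·cos θ, which is an assumption made here: it ignores the head and the device placement that this slide is about.

```python
import numpy as np

def directivity_index_db(a, n=100_000):
    """DI in dB of B(theta) = a + (1 - a) * cos(theta), target at theta = 0."""
    # Midpoints uniform in u = cos(theta) sample the sphere uniformly
    # for a pattern that is rotationally symmetric about the target axis.
    u = -1.0 + (np.arange(n) + 0.5) * (2.0 / n)
    pattern = a + (1.0 - a) * u
    diffuse_power = np.mean(pattern ** 2)
    on_axis_power = 1.0  # B(0) = a + (1 - a) = 1
    return 10.0 * np.log10(on_axis_power / diffuse_power)
```

For these ideal patterns, an omnidirectional microphone (a = 1) has a DI of 0 dB, a cardioid (a = 0.5) about 4.8 dB and a hypercardioid (a = 0.25) about 6 dB.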


Factors causing BF mismatch
The beamformer performance in our current products can be limited due to level and phase mismatch caused by the following factors:

time invariant                    time variant
Microphone production mismatch    Microphone ageing
HI assembly                       HI repairing
Clean W&W variability             W&W pollution

Customer individual head/pinna shape
Non-idealities of current adaptive level matching block
Device geometry: ITEs and microBTEs have unfavorable mic positions
Customer HI positioning variance


Effects of Microphone/BF mismatch
• microphone phase deviation → rotated null direction
• microphone magnitude deviation → reduced suppression
[Diagram: polar patterns with target and blocking directions]
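Both effects can be reproduced with a simple free-field model of a two-microphone differential beamformer, in which the back signal is delayed by d/c and subtracted. The model, the 12 mm spacing and the mismatch values in the usage note are assumptions for illustration only, not measured device data.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def response(theta_deg, freq_hz, gain_db=0.0, phase_rad=0.0, d=0.012):
    """|front - delayed back| when the back microphone has a gain/phase error."""
    theta = np.deg2rad(theta_deg)
    omega_tau = 2.0 * np.pi * freq_hz * d / C
    gain = 10.0 ** (gain_db / 20.0)
    back = gain * np.exp(-1j * (omega_tau * (np.cos(theta) + 1.0) + phase_rad))
    return np.abs(1.0 - back)

def null_angle_deg(freq_hz, **mismatch):
    """Direction (0.1 degree grid) with the smallest response."""
    angles = np.arange(0.0, 180.05, 0.1)
    return float(angles[np.argmin(response(angles, freq_hz, **mismatch))])

def rear_suppression_db(freq_hz, **mismatch):
    """Response at 180 degrees relative to the response at 0 degrees, in dB."""
    rear = response(180.0, freq_hz, **mismatch)
    front = response(0.0, freq_hz, **mismatch)
    return float(20.0 * np.log10(rear / front))
```

In this model, a matched pair has its null at 180°. A phase lead of 0.05 rad on the back microphone at 1 kHz (`phase_rad=-0.05`) rotates the null to roughly 140°, and a 1 dB level error (`gain_db=1.0`) leaves only about 12 dB of rear suppression.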


Binaural Directional Microphones
Maximum SNR improvement: 3 dB
[Diagram: one beamformer per ear, connected by wireless transmission; each side outputs $\sum_i w_i \cdot X_i$]
Improving directivity by linear combination of monaural directional microphone outputs
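One way to see where a 3 dB limit can come from: if the two monaural outputs carry the same target but independent noise of equal power, averaging them doubles the SNR. That noise model is an assumption made for this sketch; the slide does not state how the figure was derived.

```python
import numpy as np

def snr_db(signal, noise):
    """SNR in dB from separate signal and noise components."""
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

def binaural_sum_gain_db(n_samples=200_000, seed=0):
    """SNR gain of sum_i w_i * X_i with w = (0.5, 0.5) over one side alone."""
    rng = np.random.default_rng(seed)
    target = rng.standard_normal(n_samples)       # identical at both sides
    noise_left = rng.standard_normal(n_samples)   # independent noise, left
    noise_right = rng.standard_normal(n_samples)  # independent noise, right
    w = np.array([0.5, 0.5])
    out_target = w[0] * target + w[1] * target
    out_noise = w[0] * noise_left + w[1] * noise_right
    return snr_db(out_target, out_noise) - snr_db(target, noise_left)
```

The simulated gain comes out close to 10·log10(2) ≈ 3 dB. Noise that is correlated between the two sides would give less.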


Test setup
• Subjects
  • 20 adults
  • Moderate to moderately-severe hearing loss
  • Exélia Art and Ambra microP BTE
• Algorithms
  • Exélia
  • Ambra UltraZoom (monaural)
  • Ambra StereoZoom (binaural)
• Test setup
  • OLSA: speech intelligibility in noise
  • Listening effort scaling
  • Paired comparison


Binaural Beamforming


OLSA
[Figure: panels A) and B), titled "OLSA, 60° angle" and "OLSA, 45° angle"; SRT 50% in dB SNR (axis 0 to -14) for Exélia Art VoiceZoom, Ambra UltraZoom and Ambra StereoZoom]


Paired Comparison – Subjective Speech Intelligibility
With which hearing aid do you understand better?
[Figure: number of comparisons (axis 0 to 70) for Ambra UZ, Ambra SZ and Exelia VZ; interfering noise at 45° angle and 60° angle]


Paired Comparison – Subjective Listening Effort
With which hearing aid do you understand more easily?
[Figure: number of comparisons (axis 0 to 80) for Ambra UZ, Ambra SZ and Exelia VZ; interfering noise at 45° angle and 60° angle]


User-Steered directionality
• Traditional beamforming systems focus only on the front
• Speech signals do not always come from the front, and facing the speaker is not always possible
• Car, restaurants, small groups
• ZoomControl, accessible through myPilot, allows Exélia wearers to select the direction in which to focus hearing


Listen to the side: User-Steered directionality
• Uses four-microphone network of full bandwidth binaural instruments
• Broadband audio data transfer between devices focuses hearing in one specific direction, while suppressing all signals in other directions


User-Steered directionality


User-Steered directionality
[Figure: SNR (dB SPL), axis -9 to 7, for 0° (front), 90° (left), 180° (back) and 270° (right); conditions: without, adaptive multichannel directionality, steerable directionality; device: ExeliaArt P]


Subjective Evaluation – Listening Effort
Which setting needs the least listening effort to understand well? (For first time and experienced user (n=9))
[Figure: bar chart, axis 0% to 100%; category labels as extracted: Without, ZoomControl, VoiceZoom, Omni; series: male speech, female speech; bar labels as extracted: 88%, 13%, 78%, 11%, 11%]


Binaural noise reduction techniques
• Different types of algorithms
• Beamformer: spatial information, timing difference
• Binaural Wiener Filter
• Blind source separation: statistical information, estimating room transfer function
• Auditory processing schemes
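To make the Wiener-filter entry concrete, here is a minimal sketch of a multichannel Wiener filter for one frequency bin. It estimates the speech component at a reference microphone from the noisy and noise-only spatial covariances. The four-microphone setup, the random transfer functions, and the use of spatially uncorrelated noise are assumptions for this demo, not the configuration evaluated in the talk.

```python
import numpy as np

def covariance(frames):
    """Sample spatial covariance of an (n_mics, n_frames) complex array."""
    return frames @ frames.conj().T / frames.shape[1]

def mwf_weights(noisy, noise_only, ref=0):
    """w = R_y^{-1} (R_y - R_n) e_ref, the minimum mean-square-error solution."""
    r_y = covariance(noisy)
    r_x = r_y - covariance(noise_only)  # estimated speech covariance
    return np.linalg.solve(r_y, r_x[:, ref])

def demo_snr_gain_db(n_mics=4, n_frames=5000, seed=0):
    """SNR gain of the filter output over the reference microphone."""
    rng = np.random.default_rng(seed)

    def cnoise(shape):
        return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

    # Hypothetical unit-magnitude transfer functions from source to microphones.
    steering = np.exp(-1j * rng.uniform(0.0, 2.0 * np.pi, n_mics))
    speech = np.outer(steering, cnoise(n_frames))
    noise = cnoise((n_mics, n_frames))
    # The noise covariance comes from a separate noise-only segment.
    w = mwf_weights(speech + noise, cnoise((n_mics, n_frames)))

    def power(x):
        return np.mean(np.abs(x) ** 2)

    snr_in = power(speech[0]) / power(noise[0])
    snr_out = power(w.conj() @ speech) / power(w.conj() @ noise)
    return 10.0 * np.log10(snr_out / snr_in)
```

In this idealized setting the gain approaches 10·log10(4) ≈ 6 dB. In practice the noise covariance has to be estimated during speech pauses, so a voice activity detector is needed.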


BWF: Speech Intelligibility Weighted Gain
[Figure: speech intelligibility weighted gain versus acoustic environment. PhD Thesis van den Bogaert 2008]


BWF: Speech Intelligibility Weighted Gain
[Figure: speech intelligibility weighted gain versus acoustic environment. PhD Thesis van den Bogaert 2008]


Binaural Beam Forming / Noise Reduction
• No stereo output signal => loss of spatial sensation / localization
  • Artificially re-introduce that by “split-directionality”
  • Mixing in part of the original signal at the output
• Narrower beam width
  • How narrow should the beam be (head movement!)?
• Complex environments
  • Dynamic -> target tracking, target identification
  • Reverberation & distance
• Expected improvements: specific situation, no generic solution
  • Single / few strong interfering sources, frontal hemisphere
  • Environments with little reverberation


Technical constraints
• Delay over the link
• Clock jitter
• Noise floor, signal degradation
• Microphone calibration (amplitude and phase)


Earlevel FM


Modern FM Technology
• Dynamic Speech Extraction
• Automatic FM advantage: Adjusts the FM gain depending on the environmental noise level
• Surrounding Noise Compensation
• Voice Activity Detector
• Multi-talker networks: New team teaching concept using up to 10 transmitters


SNR at ear level for different technologies
[Figure: SNR (dB), axis -20 to 40, versus surrounding noise (dB SPL), axis 40 to 85, for: No FM; Traditional FM: fixed FM advantage; Adaptive FM advantage]
10 dB FM advantage:
- Good environmental awareness
- Audibility of the own voice
- Compromise at high noise levels


[Diagram: test setup with loudspeakers at 45°, 135°, 225° and 315°, 1 m from the listener; HINT sentences; correlated HINT noise; TX3 transmitter]
1 meter (39") from loudspeakers to center of head.
7.6 cm (3") from loudspeaker to TX3 transmitter.
Field study with 48 adults
Source: Valente, 2002


Speech Intelligibility Threshold

Listening condition    Mean RTS (dB)    SD
Unaided                  8.6            5.9
Omni                     3.1            2.3
Dual                    -0.8            1.8
FM-M                   -14.6            6.5
FM-B                   -18.9            3.2
Normal                  -5              2.1

Source: Valente, 2002


Auditory Scene Analysis / Hearing Instrument Processing

Hearing instrument processing: Target signal – assumption: in front…
Auditory processing: Attention control: target signal identification and tracking, switching back and forth between objects, overcoming salient sources

Retrospective analysis
Dynamic aspects: head / source movement …
A priori knowledge, “situational knowledge” - other sensory modalities - “world knowledge”, models of sources → fill in information…

Hearing instrument processing: Channel with limited information capacity
Auditory processing: Channel: full information capacity

Hearing instrument processing: Signal reconstruction & signal modification: amplification & attenuation / filtering -> “distortions”
Auditory processing: No signal reconstruction! ⇒ Perceptual attenuation, focus attention, suppression of neuronal activity

Hearing instrument processing: Delay constraint, real-time processing; comp. power constraint - limited signal analysis, spectro-temporal resolution
Auditory processing: No delay constraint, no real-time processing; higher resolution signal analysis; much higher computational power => stream segregation & source formation: works on several different time scales

Hearing instrument processing: Bottom up
Auditory processing: Bottom up / top down


Conclusion
• Hearing instruments offer several algorithms to improve speech intelligibility in complex listening environments
• Algorithms based mainly on
• Speech intelligibility in complex listening environments remains a huge challenge
• Reverberation and distance
• Dynamic target selection and tracking
• Technical limitations
• Realistic test setups and test procedures


Questions
• Speech intelligibility: how much is top-down driven versus bottom-up processing?
• Speech intelligibility: how fast is it really??
• How much information do we infer at the end of a “sentence”?
• Which cues (pitch, temporal fine structure, location, …) are the essential ones, and does it depend on the situation?
• How does the auditory system pick the relevant one??
• How do we achieve “perceptual constancy”: voices in real life always sound the same, (almost) independent of environment?


Thank you…!!!


Speech Intelligibility in reverberant environments
[Figure: speech intelligibility % (axis 0 to 100) versus reverberation time (sound suite, T = 0.54, T = 1.55) for normal hearing, mild and moderate / severe hearing loss. Harris & Swenson, Audiology 1990, p. 314-321]


Binaural processing - audio delay
• Group delay:
  • Is mainly determined by radio bandwidth, ADC, CODEC, buffering (error correction)
  • Delay shall be deterministic and constant
  • For binaural audio processing, the link delay adds to the other signal processing delays, i.e. FFT block processing, ADC.
  • Overall system delay should be less than 10 ms (Stone & Moore 2005, …)
• Audio signals + control data:
  • Some more delay for gain control is acceptable (Hohmann 2009)


Jitter – examples: 800 Hz pure tone
[Figure: phase difference / deg (marked value: 30°) versus time / s (0 to 10)]
$T_{\mathrm{jitter}} = \frac{\mathrm{phasediff}}{360^\circ \cdot \mathrm{freq}} \le 30\,\mu\mathrm{s}$
Acoustic delay from head dimension: typ. 500 µs for ear distance
Normal hearing minimum audible angle: a few µs
Jitter should be smaller than 20 µs RMS
-> allows for binaural beamforming without significant localization errors
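The relation between timing jitter and phase difference for a pure tone can be checked with a few lines. The helper names are illustrative only; the sketch simply applies T = phasediff / (360° · freq).

```python
def phase_to_time_us(phase_deg, freq_hz):
    """Time shift in microseconds for a phase difference at freq_hz."""
    return phase_deg / (360.0 * freq_hz) * 1e6

def time_to_phase_deg(time_us, freq_hz):
    """Phase difference in degrees caused by a time shift at freq_hz."""
    return time_us * 1e-6 * freq_hz * 360.0
```

For the 800 Hz tone, one period lasts 1250 µs, so a 20 µs jitter corresponds to a phase difference of about 5.8° and a 30 µs shift to about 8.6°.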

