Jan Reimes
HEAD acoustics GmbH
Sophia Antipolis, 2017-05-10
Loudness of transmitted speech signals for
SWB and FB applications
Challenges, auditory evaluation and proposals for
handset and hands-free scenarios
Loudness of Transmitted Speech Signals, Jan Reimes
Introduction (1/2)
▪ Loudness of the received speech signal: simplest but important
quality parameter of a communication device!
▪ Too loud: annoying, may cause hearing damage!
▪ Too quiet: impact on intelligibility (and other aspects of
conversational quality…)
▪ Several measurement standards provide requirements for
comfortable listening level
▪ Loudness == Level? Psycho-acoustics!
▪ For NB and WB terminals, so-called loudness ratings (LR) are used
to evaluate transmission characteristics (e.g. ITU-T P.79)
▪ Basic concept of LR: calculate attenuation (in dB) to achieve same
perceived loudness compared to intermediate reference system (IRS)
▪ Weighted sum of transfer function provides attenuation versus IRS
▪ Technical measure; no information about absolute loudness
▪ Addresses mainly linear distortions
▪ Method not (yet?) defined for SWB/FB applications
Introduction (2/2)
[Figure: three panels of level L/dB over frequency f/Hz (axis ticks 300, 500, 2000, 4000 Hz; 5 to 15 dB), including the reference system R(f) and the device under test H(f)]
$$D(f) = H(f) - R(f)$$
$$LR \sim \sum_{f} w(f) \cdot D(f)$$
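The weighted-sum idea behind a loudness rating, LR ~ Σ w(f)·D(f) with D(f) = H(f) − R(f), can be illustrated with a minimal Python sketch. All numbers and the name `weighted_difference` are hypothetical; ITU-T P.79 defines its own band weights and summation rule, which are not reproduced here.

```python
def weighted_difference(h_db, r_db, weights):
    """Return D(f) per band and the weighted sum over all bands.

    h_db    -- device-under-test response H(f) in dB, one value per band
    r_db    -- reference-system response R(f) in dB, same bands
    weights -- per-band weights w(f) (hypothetical, NOT the P.79 weights)
    """
    if not (len(h_db) == len(r_db) == len(weights)):
        raise ValueError("all inputs must have the same number of bands")
    d_db = [h - r for h, r in zip(h_db, r_db)]            # D(f) = H(f) - R(f)
    lr_like = sum(w * d for w, d in zip(weights, d_db))   # sum of w(f) * D(f)
    return d_db, lr_like


# Hypothetical responses at 300, 500, 2000 and 4000 Hz
h_db = [7.0, 11.0, 13.0, 9.0]
r_db = [5.0, 9.0, 15.0, 9.0]
weights = [0.25, 0.25, 0.25, 0.25]

d_db, lr_like = weighted_difference(h_db, r_db, weights)
print(d_db, lr_like)
```

The result is a relative measure in dB against the reference system, which is why it carries no information about absolute loudness.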
▪ Standardization: ITU-T SG12 / Q5 launched new work item
“P.Loudness”
▪ Goal: evaluate and/or modify existing loudness models originating
from the psycho-acoustic domain
▪ Several standardized models already exist:
▪ “Zwicker approach” (DIN 45631/A1, ISO 532-1)
▪ “Moore/Glasberg approach” (ANSI S3.4-2007, ISO 532-2)
▪ Current “release candidate model” for P.Loudness available
▪ Based on very basic auditory experiments, no real terminals
▪ Loudness model is based on stationary loudness (ANSI S3.4)
▪ Modifications are fitted to auditory results
▪ Two modes for handset/hands-free are required
▪ Not applicable to artificial head recordings (handset)
▪ No binaural aspects considered
Recent work on loudness
▪ Large test corpus based on binaural recordings of terminals (3G,
4G, VoIP) and realistic simulations (compression, codecs,
loudspeaker distortions)
▪ 8 German test sentences (ITU-T P.501) as source material
▪ Bandwidth from NB (up to 3.4 kHz) to FB (up to 20 kHz)
▪ Level range between 40 and 90 dBSPL
52 conditions per mode (handset and hands-free mode) × 4 sentences each = 208 test stimuli per mode
Auditory evaluation (1/3)
[Figure: stimulus and its binaural recording]
▪ Absolute / categorical loudness assessment on 25-point scale
▪ 7 anchor definitions for better orientation
▪ Already used in previous studies
▪ 20 normal-hearing test subjects per mode
▪ Hearing-adequate playback of
binaural recordings in listening lab
Auditory evaluation (2/3)
▪ Prior to evaluation: determination of individual loudness functions per
test subject with a reference sound
▪ Principle of reference sound: should cause similar “loudness
excitation” as speech, but independent of language, content, talker, …
▪ Three different reference sounds were evaluated:
▪ 1 kHz Sine tone (refers to definition of sone/phon)
▪ 1 Bark noise at 1 kHz (used in initial P.Loudness experiments)
▪ 3 Bark noise at 1 kHz (less tonal, “smooth”)
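The 1 kHz sine tone anchors the sone/phon definition: for this tone the loudness level in phon equals its level in dB SPL, 1 sone corresponds to 40 phon, and above 40 phon each additional 10 phon doubles the loudness in sone. A minimal sketch of that standard conversion follows; the function names are illustrative, and the simple power-of-two relation is only applied from 40 phon upwards.

```python
import math


def phon_to_sone(phon):
    """Convert loudness level (phon) to loudness (sone), valid for >= 40 phon."""
    if phon < 40.0:
        raise ValueError("simple relation only holds from 40 phon upwards")
    return 2.0 ** ((phon - 40.0) / 10.0)


def sone_to_phon(sone):
    """Convert loudness (sone) to loudness level (phon), valid for >= 1 sone."""
    if sone < 1.0:
        raise ValueError("simple relation only holds from 1 sone upwards")
    return 40.0 + 10.0 * math.log2(sone)


print(phon_to_sone(60.0))  # loudness of a 1 kHz tone at 60 dB SPL, in sone
print(sone_to_phon(2.0))   # loudness level of a sound twice as loud as 1 sone
```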
Auditory evaluation (3/3)
▪ Several state-of-the-art loudness models are evaluated:
▪ Zwicker: ISO 532-1
▪ Moore/Glasberg: ANSI S3.4 (stationary), version 2002 & 2016,
LT/ST smoothing
▪ P.Loudness candidate (stationary)
▪ Non-stationary models provide loudness vs. time curve, several
single value calculations are possible:
▪ Average
▪ N5 percentile (peak-oriented)
▪ LL(p) (used in recent work)
▪ Auditory results of test stimuli provide values on a point scale
Comparison to loudness models?
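Two of the single-value calculations named above, the average and the N5 percentile, can be sketched as follows. This is a minimal illustration on a hypothetical loudness-versus-time curve; it assumes N5 means the loudness reached or exceeded in 5 % of the samples and uses a nearest-rank rule, which may differ from the interpolation used by a particular tool. LL(p) is not sketched because the slides do not define it.

```python
import math


def average_loudness(curve):
    """Arithmetic mean of a loudness-vs-time curve."""
    return sum(curve) / len(curve)


def percentile_loudness(curve, exceeded_percent=5.0):
    """Loudness reached or exceeded in `exceeded_percent` % of the samples.

    Nearest-rank rule on the samples sorted from loudest to quietest.
    """
    ordered = sorted(curve, reverse=True)
    rank = max(1, math.ceil(exceeded_percent * len(ordered) / 100.0))
    return ordered[rank - 1]


# Hypothetical curve: 100 samples rising from 1 to 100 (arbitrary units)
curve = [float(i) for i in range(1, 101)]
avg = average_loudness(curve)
n5 = percentile_loudness(curve, 5.0)
print(avg, n5)
```

Because N5 follows the loudest portions of the signal, it sits well above the average for speech with pauses, which is why it is called peak-oriented.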
Results of loudness models (1/5)
▪ Proposed procedure for comparison between loudness models
(results in phon/sone) and auditory test results (in points)
▪ Select reference signal (Sine, 1 Bark noise, 3 Bark noise, …)
▪ Calculate inverse of loudness functions with mapping function
▪ Transform auditory results in points to level in dBERL
▪ Example: 15.0 points in the listening test correspond to ≈75 dBERL
(same loudness as the 3 Bark noise reference signal at 75 dBSPL)
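The auditory side of this procedure can be sketched as fitting a loudness function (points versus reference-sound level) and inverting it. The sketch below assumes, purely for illustration, that the loudness function is a straight line fitted by least squares; the slides do not state the actual function shape, and the rating data are hypothetical values chosen so that 15 points land at 75 dB.

```python
def fit_loudness_function(levels_db, points):
    """Least-squares line points = slope * level + intercept (assumed shape)."""
    n = len(levels_db)
    mean_x = sum(levels_db) / n
    mean_y = sum(points) / n
    sxx = sum((x - mean_x) ** 2 for x in levels_db)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels_db, points))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def points_to_dberl(points_value, slope, intercept):
    """Invert the fitted line: reference-sound level giving the same rating."""
    return (points_value - intercept) / slope


# Hypothetical ratings of the reference sound at 40..90 dB SPL
levels_db = [40.0, 50.0, 60.0, 70.0, 80.0, 90.0]
points = [1.0, 5.0, 9.0, 13.0, 17.0, 21.0]

slope, intercept = fit_loudness_function(levels_db, points)
print(points_to_dberl(15.0, slope, intercept))
```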
Results of loudness models (2/5)
▪ Proposed procedure for comparison between loudness models
(results in phon/sone) and auditory test results (in points)
▪ Select loudness model and single-value aggregation
▪ Calculate loudness (in sone or phon) for selected reference
sound for a certain level range (e.g. from 40 to 90 dBSPL)
▪ Calculate mapping function between sone/phon and level
▪ Run loudness model on signal-under-test
▪ Transform output from sone/phon to level in dBERL with
previously determined mapping function
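The model side of the procedure (tabulate the model output for the reference sound over a level range, then map the model output for the signal under test back to a level) can be sketched as below. `reference_loudness_sone` is a hypothetical stand-in for running a real loudness model on the reference sound; it simply uses the sone definition for a 1 kHz tone. The 5 dB grid and the linear interpolation between grid points are also assumptions.

```python
from bisect import bisect_left


def reference_loudness_sone(level_db_spl):
    """Stand-in for a loudness model applied to the reference sound."""
    return 2.0 ** ((level_db_spl - 40.0) / 10.0)


def build_mapping(model, levels_db):
    """Tabulate (loudness, level) pairs; loudness must rise with level."""
    return [(model(level), level) for level in levels_db]


def loudness_to_dberl(loudness, mapping):
    """Map a model output in sone to a level via linear interpolation."""
    loudness_values = [n for n, _ in mapping]
    if not loudness_values[0] <= loudness <= loudness_values[-1]:
        raise ValueError("loudness outside the tabulated level range")
    idx = bisect_left(loudness_values, loudness)
    if loudness_values[idx] == loudness:
        return float(mapping[idx][1])
    (n0, l0), (n1, l1) = mapping[idx - 1], mapping[idx]
    return l0 + (l1 - l0) * (loudness - n0) / (n1 - n0)


# Reference sound evaluated from 40 to 90 dB SPL in 5 dB steps
mapping = build_mapping(reference_loudness_sone, range(40, 95, 5))

# Hypothetical model output of 8 sone for a signal under test
print(loudness_to_dberl(8.0, mapping))
```

With both the auditory results and the model outputs expressed on the same level axis, they can be compared directly in dB.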
Results of loudness models (2/5)
▪ Large amount of combinations possible (models, single values,
reference signal)
▪ Evaluation of prediction performance by RMSE*
▪ Considering uncertainty of auditory data
▪ “Baseline” performance: auditory results vs. active speech level
(ASL) acc. to ITU-T P.56; models should perform better!
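One way to account for the uncertainty of auditory data in an RMSE is the epsilon-insensitive form used in ITU-T P.1401, where each absolute error is reduced by the 95 % confidence interval of the auditory result before squaring. The sketch below assumes that this is what RMSE* denotes here; the slides do not spell out the formula. The data and the choice of degrees of freedom are hypothetical.

```python
import math


def rmse_star(auditory, predicted, ci95, degrees_of_freedom):
    """Epsilon-insensitive RMSE: errors inside the confidence interval count as 0.

    auditory, predicted -- per-condition values on the same scale (e.g. dB)
    ci95                -- 95 % confidence interval of each auditory value
    degrees_of_freedom  -- value d subtracted from N (a choice not stated
                           in the slides)
    """
    n = len(auditory)
    if not (len(predicted) == n and len(ci95) == n):
        raise ValueError("all inputs must have the same length")
    if n <= degrees_of_freedom:
        raise ValueError("need more conditions than degrees of freedom")
    total = 0.0
    for a, p, ci in zip(auditory, predicted, ci95):
        error = max(0.0, abs(a - p) - ci)
        total += error ** 2
    return math.sqrt(total / (n - degrees_of_freedom))


# Hypothetical auditory results and predictions for three conditions
result = rmse_star([60.0, 70.0, 80.0], [62.0, 70.0, 75.0], [1.0, 1.0, 1.0], 1)
print(result)
```

Running the same function with ASL as the predictor gives the baseline that a loudness model would need to beat.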
Results of loudness models (3/5)
[Figure panels: ASL/Sine (HF), ASL/Sine (HS)]
Results of loudness models (4/5)
Selected results per loudness model – handset mode
[Figure panels: ISO 532-1/Avg./Sine, TVL2016-LT/Avg./3 Bark, TVL2002-ST/Avg./1 Bark, P.Loudness/3 Bark]
Results of loudness models (5/5)
Selected results per loudness model – hands-free mode
[Figure panels: P.Loudness/3 Bark, ISO 532-1/Avg./Sine, TVL2002-LT/N5/3 Bark, TVL2016-LT/LL(p)/3 Bark]
▪ Loudness assessment is a challenging task!
▪ SWB/FB terminals are commercially available – but currently no
instrumental loudness assessment test methods available
▪ Large auditory database collected and listening tests conducted
▪ Considering state-of-the-art terminals and realistic simulations
▪ Evaluation of loudness models – no clear “winner”:
▪ ISO 532-1 very accurate for HS & HF: single model for both
▪ TVL2016-LT slightly worse, but considers binaural inhibition
▪ P.Loudness candidate also performs adequately, but…
▪ New loudness model not necessarily needed?
▪ Finalize P.Loudness work item in standardization
▪ Specify application of loudness models in measurement standards
Summary & Conclusions
Jan Reimes
Research & Standardization HEAD acoustics
www.head-acoustics.de © Copyright HEAD acoustics GmbH