New Models for Perceived Voice Quality Prediction and their Applications in Playout Buffer Optimization for VoIP Networks
Dr. Lingfen Sun, Prof Emmanuel Ifeachor
University of Plymouth, United Kingdom
{L.Sun; E.Ifeachor}@plymouth.ac.uk
ICC 2004, Paris, France, 20-24 June 2004
Outline
- Background
  o Speech quality for VoIP networks
  o Current status
  o Aims of the project
- Main Contributions
  o Novel non-intrusive voice quality prediction models
  o Novel perceptual-based speech quality optimization (e.g. jitter buffer optimization) mechanism
- Conclusions and Future Work
Background – Speech Quality for VoIP Networks
VoIP speech quality: end-user perceived quality (MOS), an important metric.
Affected by IP network impairments and other impairments.
Voice quality measurement: subjective (MOS) or objective (intrusive or non-intrusive)
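[Figure: end-to-end path SCN, Gateway, IP Network, Gateway, SCN, with end-to-end perceived speech quality. Intrusive measurement derives MOS from reference speech and degraded speech; non-intrusive measurement derives MOS without the reference speech.]
SCN: Switched Communication Networks (PSTN, ISDN, GSM ...)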
Current Status and Problems
- Lack of an efficient non-intrusive speech quality measurement method
  o E-model (a complicated computational model)
  o Based on subjective tests to derive models/parameters, time-consuming and expensive.
  o Only limited models exist
- Lack of perceptual optimization control methods
  o only based on individual network parameters for buffer optimization and QoS control purposes
  o not perceptual-based optimization control
Aims of the Project
[Figure: VoIP system. Sender: voice source, encoder, packetizer. IP network. Receiver: de-packetizer, jitter buffer, decoder, voice receiver. Non-intrusive measurement yields the end-to-end perceived voice quality (MOS).]
To develop novel and efficient methods/models for non-intrusive quality prediction.
To apply the models to perceptual-based optimization control (e.g. buffer optimization or adaptive sender-bit-rate QoS control).
Novel Non-intrusive Voice Quality Prediction
Uses intrusive quality measurement (e.g. PESQ) to predict voice quality non-intrusively, which avoids subjective tests.
A generic method which can be applied to audio, image and video.
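[Figure: Intrusive method: PESQ compares reference speech with degraded speech from the VoIP network to give MOS (PESQ), which is combined with delay through the E-model to give the measured MOSc. Non-intrusive method: a new model (regression or ANN models) maps network parameters (packet loss, delay, codec ...) to the predicted MOSc.]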
New Structure to Obtain MOSc
PESQ can only predict one-way listening speech quality (expressed as MOS).
With a new combined PESQ/E-model structure, a conversational speech quality score (MOSc) can be obtained as the measured MOSc.
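[Figure: Reference speech and degraded speech are fed to PESQ, giving MOS (PESQ), which is converted MOS -> R -> Ie. The end-to-end delay is fed to the delay model, giving Id. Ie and Id are combined in the E-model to give MOSc.]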
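The combined structure above can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the standard ITU-T G.107 mapping between the rating factor R and MOS, a default R0 of 93.2, and Ie = R0 - R. None of these values appear on the slide, and the function names are hypothetical.

```python
R0 = 93.2  # assumed default E-model rating with no impairments (not from the slide)


def r_to_mos(r: float) -> float:
    """ITU-T G.107 mapping from rating factor R to MOS."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def mos_to_r(mos: float) -> float:
    """Invert r_to_mos by bisection on [6.5, 100], where it is increasing."""
    lo, hi = 6.5, 100.0
    mos = min(max(mos, r_to_mos(lo)), r_to_mos(hi))  # clamp to the invertible range
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if r_to_mos(mid) < mos:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def measured_mosc(pesq_mos: float, delay_impairment: float) -> float:
    """MOS (PESQ) -> R -> Ie, then combine Ie with Id in the E-model to get MOSc."""
    ie = R0 - mos_to_r(pesq_mos)       # assumed: Ie is the drop in R caused by codec and loss
    r = R0 - ie - delay_impairment     # E-model rating with both impairments
    return r_to_mos(r)
```

With zero delay impairment the function returns the PESQ score unchanged; any positive Id lowers the conversational score.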
Regression based Models (1)
Nonlinear regression models are derived for Ie based on PESQ/PESQ-LQ.
Ie is then combined with Id to obtain MOSc.
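[Figure, panels (a) and (b). Deriving the Ie model: reference speech from a speech database passes through encoder, loss model and decoder to give degraded speech; PESQ/PESQ-LQ gives MOS, converted MOS -> R -> Ie as the measured Ie, to which the nonlinear regression model (Ie model) is fitted to give the predicted Ie. Predicting MOSc: codec and packet loss feed the Ie model to give Ie, delay (d) feeds the Id model to give Id, and the E-model combines them into MOSc.]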
Regression based Models (2)
Ie can be modelled by a logarithmic fitting function of the form

$$I_e = a \ln(1 + b\rho) + c$$

where \rho is the packet loss rate in percent.

Parameters for different codecs (PESQ)

Parameters   AMR(H)   AMR(L)   G.729   G.723.1   iLBC
a            16.68    30.86    21.14   20.06     12.59
b*100        30.11    4.26     12.73   10.24     9.45
c            14.96    31.66    22.45   25.63     20.42
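As a minimal sketch, the logarithmic fit Ie = a ln(1 + b * loss) + c and the codec parameters listed above can be written as a small lookup function. The names `IE_PARAMS` and `predict_ie` are hypothetical, and the packet loss is assumed to be given in percent.

```python
import math

# (a, b, c) per codec, copied from the table; the table lists b*100, so b is divided by 100 here.
IE_PARAMS = {
    "AMR(H)": (16.68, 0.3011, 14.96),
    "AMR(L)": (30.86, 0.0426, 31.66),
    "G.729": (21.14, 0.1273, 22.45),
    "G.723.1": (20.06, 0.1024, 25.63),
    "iLBC": (12.59, 0.0945, 20.42),
}


def predict_ie(codec: str, loss_percent: float) -> float:
    """Predicted equipment impairment Ie for a codec at a given packet loss (assumed in %)."""
    a, b, c = IE_PARAMS[codec]
    return a * math.log(1.0 + b * loss_percent) + c
```

At zero loss the function returns c, the impairment of the codec alone; for example `predict_ie("AMR(H)", 0.0)` gives 14.96.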
Regression Models for AMR (12.2 kb/s)
E.g. for AMR (12.2 kb/s),

$$I_e = 16.68 \ln(1 + 0.3011\rho) + 14.96$$

The goodness of fit is: SSE = 2.83 and R^2 = 0.998
[Figure: MOS vs. packet loss and delay]
Perceptual-based Buffer Optimization
Motivation:
- Current buffer algorithms are based only on individual network parameters (e.g. delay or loss), targeting only minimum average delay or minimum late arrival loss, not maximum MOS.
- There is a need to design a buffer algorithm that achieves optimum perceived speech quality.
Contribution:
- A perceptual-based optimization jitter buffer algorithm
  o Use regression based models for buffer optimization
  o Use a minimum impairment criterion instead of traditional maximum MOS score
  o A Weibull delay distribution based on trace analysis
  o A perceptual-based optimization of playout buffer algorithm
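Impairment Function Im
Define: impairment function Im

$$I_m = f(d, \rho) = I_d + I_e = 0.024\,d + 0.11\,(d - 177.3)\,H(d - 177.3) + a \ln(1 + b\rho) + c$$

where

$$H(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \ge 0 \end{cases}$$

and a, b and c are parameters related to the codec.

For playout delay d, the total loss \rho combines the network loss \rho_n with the buffer loss \rho_b = P(X > d), taken from a Weibull distribution of the network delay X:

$$\rho = \rho_n + (100 - \rho_n)\,P(X > d) = \rho_n + (100 - \rho_n)\,e^{-((d - \mu)/\alpha)^{\gamma}}$$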
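The impairment function can be sketched in Python as below. The function names are hypothetical, and the Weibull parameters are assumed to be a location `mu`, a scale `alpha` and a shape `gamma`; delays are in milliseconds and losses in percent.

```python
import math


def delay_impairment(d_ms: float) -> float:
    """Id = 0.024 d + 0.11 (d - 177.3) H(d - 177.3), with H the unit step."""
    step = 1.0 if d_ms - 177.3 >= 0.0 else 0.0
    return 0.024 * d_ms + 0.11 * (d_ms - 177.3) * step


def buffer_loss_fraction(d_ms: float, mu: float, alpha: float, gamma: float) -> float:
    """P(X > d): fraction of packets arriving later than playout delay d (Weibull tail)."""
    if d_ms <= mu:
        return 1.0  # no packet can arrive earlier than the location parameter
    return math.exp(-(((d_ms - mu) / alpha) ** gamma))


def total_loss_percent(d_ms, net_loss_percent, mu, alpha, gamma):
    """Network loss plus late-arrival (buffer) loss among the packets that did arrive."""
    late = buffer_loss_fraction(d_ms, mu, alpha, gamma)
    return net_loss_percent + (100.0 - net_loss_percent) * late


def impairment(d_ms, net_loss_percent, codec_abc, weibull):
    """Im = Id + Ie for playout delay d, network loss and codec parameters (a, b, c)."""
    a, b, c = codec_abc
    mu, alpha, gamma = weibull
    rho = total_loss_percent(d_ms, net_loss_percent, mu, alpha, gamma)
    return delay_impairment(d_ms) + a * math.log(1.0 + b * rho) + c
```

A small playout delay keeps Id low but raises the buffer loss and therefore Ie; a large playout delay does the opposite. This trade-off is what a perceptual optimization can exploit.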
Minimum Impairment Criterion
Define: minimum impairment criterion
Given: network delay dn, network loss \rho_n and codec type
Estimate: an optimized playout delay dopt
Such that: the minimum Im is reached.
[Figure: candidate playout delays d1, d2, d3, d4, with the minimum Im marked]
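One simple way to apply the minimum impairment criterion is a grid search over candidate playout delays. The slides do not state which search method is used, so the exhaustive search, the 1 ms step and the 400 ms upper bound below are all assumptions, and the names are hypothetical. The impairment function is restated in compact form so that the sketch runs on its own.

```python
import math


def impairment(d, rho_n, a, b, c, mu, alpha, gamma):
    """Im = Id + Ie for playout delay d (ms), network loss rho_n (%) and Weibull delays."""
    i_d = 0.024 * d + (0.11 * (d - 177.3) if d >= 177.3 else 0.0)
    rho_b = 1.0 if d <= mu else math.exp(-(((d - mu) / alpha) ** gamma))
    rho = rho_n + (100.0 - rho_n) * rho_b
    return i_d + a * math.log(1.0 + b * rho) + c


def optimal_playout_delay(rho_n, codec_abc, weibull, d_max=400.0, step=1.0):
    """Return (d_opt, Im_min) by scanning playout delays from mu up to d_max."""
    a, b, c = codec_abc
    mu, alpha, gamma = weibull
    best_d, best_im = mu, float("inf")
    n_steps = int((d_max - mu) / step)
    for k in range(n_steps + 1):
        d = mu + k * step  # candidate playout delay
        im = impairment(d, rho_n, a, b, c, mu, alpha, gamma)
        if im < best_im:
            best_d, best_im = d, im
    return best_d, best_im
```

For example, `optimal_playout_delay(1.0, (16.68, 0.3011, 14.96), (50.0, 10.0, 1.0))` scans delays between 50 ms and 400 ms for the AMR (12.2 kb/s) parameters and returns the delay with the smallest Im. The Weibull values in this call are made up for illustration.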
Perceptual-based Optimization Buffer Algorithm
For every packet i received, calculate network delay ni
  if mode == SPIKE then
    if ni <= tail*old_d then
      mode = NORMAL
    endif
  elseif ni > head*di then
    mode = SPIKE; old_d = di
  else
    - update delay records for the past W packets
  endif
At the beginning of a talkspurt
  if mode == SPIKE then
    di = ni
  else
    - obtain (\mu, \alpha, \gamma) for the Weibull distribution for the past W packets
    - search for the playout delay d which meets the minimum Im criterion
  endif
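The control flow of the algorithm above can be sketched as a small Python class. This is an illustration, not the authors' implementation: the class and method names are hypothetical, the values of `head`, `tail` and the window size W are placeholders, seeding the playout delay from the first packet is an assumption, and the Weibull fit plus minimum Im search is passed in as an `optimizer` callable rather than implemented here.

```python
from collections import deque
from typing import Callable, Iterable, Optional


class PlayoutBuffer:
    """Spike-aware playout delay control with a pluggable per-talkspurt optimizer."""

    def __init__(self, optimizer: Callable[[Iterable[float]], float],
                 window: int = 1000, head: float = 4.0, tail: float = 2.0):
        self.optimizer = optimizer              # maps recent delays to a playout delay
        self.delays = deque(maxlen=window)      # delay records for the past W packets
        self.head, self.tail = head, tail       # placeholder spike thresholds
        self.mode = "NORMAL"
        self.playout_delay: Optional[float] = None  # d_i
        self.old_d = 0.0                        # playout delay when the spike started
        self.last_delay = 0.0                   # n_i of the most recent packet

    def on_packet(self, n_i: float) -> None:
        """Update the spike state for a packet with network delay n_i."""
        self.last_delay = n_i
        if self.playout_delay is None:          # assumed initialisation from first packet
            self.playout_delay = n_i
            self.delays.append(n_i)
        elif self.mode == "SPIKE":
            if n_i <= self.tail * self.old_d:   # delays are back near the pre-spike level
                self.mode = "NORMAL"
        elif n_i > self.head * self.playout_delay:
            self.mode = "SPIKE"
            self.old_d = self.playout_delay
        else:
            self.delays.append(n_i)

    def on_talkspurt_start(self) -> Optional[float]:
        """Choose the playout delay for the next talkspurt."""
        if self.mode == "SPIKE":
            self.playout_delay = self.last_delay    # follow the spike directly
        elif self.delays:
            self.playout_delay = self.optimizer(self.delays)
        return self.playout_delay
```

Passing the optimizer as a callable keeps the spike detection independent of how the delay distribution is fitted; in the usage below the built-in `max` stands in for the Weibull fit and minimum Im search.

```python
buf = PlayoutBuffer(optimizer=max)
for delay in (20.0, 22.0, 21.0):
    buf.on_packet(delay)
buf.on_talkspurt_start()   # NORMAL mode: uses the optimizer on the recorded delays
buf.on_packet(100.0)       # exceeds head * d_i, so the buffer enters SPIKE mode
```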
Performance Analysis and Comparison (1)
Five traces were selected, from UoP to CU (USA), DUT (Germany), BUPT (China), and NC (China).
Traces 1 and 3 have high delay variation; traces 2, 4 and 5 have low delay variation.
Trace   Delay (ms)   Jitter (ms)   Loss (%)
1       153          16.2          1.1
2       46           0.8           0.3
3       186          19.5          14.3
4       16           0.7           4.4
5       150          0.2           0.2
Performance Analysis and Comparison (2)
“p-optimum” algorithm achieves the optimum voice quality for all traces.
“adaptive” algorithm achieves sub-optimum quality with low complexity.
[Figure: Performance comparison for buffer algorithms. MOS (y-axis, 0.5 to 4) for traces 1 to 5 (x-axis); series: exp-avg, fast-exp, min-delay, spk-delay, adaptive, p-optimum.]
Conclusions and Future Work
Conclusions
- The development of a new methodology and regression models to predict voice quality non-intrusively.
- Demonstrated the application of new non-intrusive voice quality prediction models to perceptual-based optimization of playout buffer algorithms.
Future Work
- To consider buffer adaptation during a talkspurt in order to achieve the best trade-off between delay, loss and end-to-end jitter.
- To extend the work to improve the performance of multimedia services (e.g. audio/image/video) over IP networks
Contact Details
http://www.tech.plymouth.ac.uk/spmc
Dr. Lingfen Sun
Prof Emmanuel Ifeachor
Any questions?
Thank you!