Effects of ATM network impairments on audio-visual broadcast applications

D. Patel and L.F. Turner

Abstract: The authors provide a description of research conducted with MPEG-1 and MPEG-2 compressed broadcast material transmitted in the presence of network impairment factors, in the form of payload bit error and cell loss. The study is different from other reported work in that it incorporates a complete user-perceived quality model, which takes into account the effects of concurrent impairments on both the audio and video streams of an audio-visual stream. The joint evaluation provides a more comprehensive model of user-perceived quality acceptance levels of broadcast audio-visual applications and as such provides valuable ATM network design data.

1 Introduction

With increasing interest in applications using digital compressed audio-visual communications, a growing need exists to assess and quantify the seriousness of degradation on the perceived user quality, resulting from impairments such as network cell loss and payload bit error in an asynchronous transfer mode (ATM) environment. Some work [1-9] has been carried out to investigate the subjective effect of such impairments, but this is concerned with the video quality of the received signal only. Very little has appeared in the open literature [10] on the assessment, from the end-user point of view, of the combined effects of bit errors and cell loss on audio and video quality, and on the subjective quality of combined perturbed video and associated audio for broadcast applications.

The aims of the research reported in this paper are to evaluate, using subjective testing, the audio quality, the video quality and the perceived overall quality of broadcast services that are subject to cell loss and payload bit error in both the audio and video streams. A series of subjective tests has been conducted on broadcast-type applications over a simulated ATM network, using both MPEG-1 and MPEG-2 compressed material. Relationships have been determined relating both the audio and video quality to the overall perceived quality, together with the interaction effects when multiple impairing factors are present simultaneously.

2 Description of experimental set-up

The first of the tests was conducted with material compressed in accordance with the MPEG-1 standard. However, as commercially based broadcast audio-visual

© IEE, 2000. IEE Proceedings online no. 20000474. DOI: 10.1049/ip-vis:20000474. Paper first received 2nd July 1999 and in revised form 21st March 2000. The authors are with the Department of Electrical & Electronic Engineering, Imperial College of Science, Technology & Medicine, Exhibition Road, London SW7 2BT, UK


applications require a much better quality image than that produced with MPEG-1, the testing was extended to deal with material compressed according to the MPEG-2 standard. The transport stream-multiplexing format of the resultant higher bit-rate stream was chosen as it is specifically developed for such applications. The extensions of the test to the MPEG-2 format enabled comparisons to be made between the user models obtained from the two experiments.

In the tests the network impairments investigated, cell loss and payload bit error, were simulated in software and the MPEG compressed test material was prepared prior to conducting the tests. To meet the objectives of the one-way tests, cell loss and payload bit error software modules were developed. However, the requirement to impair the audio and video streams individually required the MPEG sequences to be split and reconstructed into their constituent audio and video streams. The entire system was simulated using a three-stage process.

In the first stage, the compressed audio-visual stream, stored as a computer file, was passed through an MPEG demultiplexer. This unit demultiplexed the combined audio-visual data into its constituent audio and video streams, and created a separate file comprising system information, which was required for the reconstruction process. The second stage involved passing the individual audio or video streams through a bit error module and then a cell loss module, which, respectively, introduced controlled amounts of bit error and cell loss into the stream. In the third stage, the resultant impaired audio and video files were passed through an MPEG multiplexer and, using the system information file from the first stage, the two were combined into a compliant MPEG audio-visual stream.

A bursty-loss model was used to simulate the loss effects associated with bursty VBR traffic. This was done to simulate the packet losses that occur in ATM video transmission due to sporadic network overload [1, 2, 11-13]. The distribution of the number of cells lost during a burst of losses was drawn from a Poisson process, with the probability $P(r)$ of obtaining a burst of $r$ cells being given by $P(r) = \lambda^{r} e^{-\lambda} / r!$. The number of packets within a burst of lost cells was chosen to be $r + 1$.

The period between bursts of losses was set so that it was governed by a uniform distribution. To achieve a desired cell loss percentage of $\gamma$, with a mean burst length of $\lambda$ cells per burst period, it was necessary that the uniform distribution be over the range $\{0,\ 2 \times 100 \times (1/\gamma) \times \lambda\}$, assuming $\gamma \ll 1$. In the tests $\gamma$ ranged from 10^-… to …

IEE Proc.-Vis. Image Signal Process., Vol. 147, No. 5, October 2000

In the test involving MPEG-1 compressed audio-visual material, the mean burst length $\lambda$ was set to three cells, and the maximum cell loss burst length was limited to ten cells. This limit had to be used since preliminary testing had shown that, in some cases, the decoder card, which was a Phillips PCAllMP, was not capable of handling burst losses in excess of ten cells. As the test involving MPEG-2 compressed audio-visual test material was conducted to investigate the same differences, the same cell loss characteristics were employed in the MPEG-2 tests. The bit error module was designed to introduce bit errors into the payload of the ATM cell at random with a probability $p$, which could be varied as desired.
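The two impairment modules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' software: the function names are hypothetical, the loss-free gap is measured from the end of one burst to the start of the next, and the burst length is simply clipped at the ten-cell limit.

```python
import math
import random


def poisson_sample(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def burst_loss_mask(n_cells, loss_percent, mean_burst=3.0, max_burst=10, seed=0):
    """Return a list of booleans, True where a cell is dropped.

    Gaps between bursts are uniform on [0, 2*100*mean_burst/loss_percent],
    so the mean gap is 100*mean_burst/loss_percent cells (assumed reading
    of the range given in the text).
    """
    rng = random.Random(seed)
    max_gap = 2 * 100 * mean_burst / loss_percent
    lost = [False] * n_cells
    pos = 0
    while True:
        pos += int(rng.uniform(0, max_gap))  # loss-free gap before the burst
        if pos >= n_cells:
            return lost
        # A Poisson draw r gives a burst of r + 1 cells, clipped at max_burst.
        burst = min(poisson_sample(rng, mean_burst) + 1, max_burst)
        for i in range(pos, min(pos + burst, n_cells)):
            lost[i] = True
        pos += burst


def flip_payload_bits(payload: bytes, p: float, rng: random.Random) -> bytes:
    """Flip each payload bit independently with probability p."""
    out = bytearray(payload)
    for byte_index in range(len(out)):
        for bit in range(8):
            if rng.random() < p:
                out[byte_index] ^= 1 << bit
    return bytes(out)
```

Because every burst contains at least one cell, the realised loss rate of this sketch sits slightly above the nominal percentage; that is a property of the simplifications made here and says nothing about the original modules.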

3 Subjective test design

The extent to which impairments are perceptible clearly depends on the type of the material being considered [3, 14-17]. At present, there are no recommended standard audio-visual sequences that can be used in assessing performance and comparing systems. However, in the ITU-T Recommendation P.910 [18], a scheme for the classification of digital video-only sequences is described. In the recommendation, sequences are classified according to their spatial and temporal characteristics. The spatial perceptual information measurement, SI [18], based on the Sobel filter, is a measure of the spatial information of a picture, or frame. The higher the SI value, the more spatially complex is the scene. The temporal perceptual information measurement, TI [18], is a measure that indicates the amount of temporal change within a video sequence, with this being higher for high-motion sequences. Test sequences can be classified and their relative temporal and spatial characteristics directly compared by plotting them on an SI-TI plot [18].
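As a rough illustration of the two measures, SI can be taken as the maximum over time of the spatial standard deviation of the Sobel-filtered luminance frame, and TI as the maximum over time of the spatial standard deviation of the difference between successive frames. The sketch below follows that reading in plain Python. The function names, the use of interior pixels only, and the population standard deviation are assumptions made for the example.

```python
import math
from statistics import pstdev


def sobel_magnitudes(frame):
    """Sobel gradient magnitude at every interior pixel of a 2-D luminance frame."""
    h, w = len(frame), len(frame[0])
    mags = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (frame[y - 1][x + 1] + 2 * frame[y][x + 1] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y][x - 1] - frame[y + 1][x - 1])
            gy = (frame[y + 1][x - 1] + 2 * frame[y + 1][x] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y - 1][x] - frame[y - 1][x + 1])
            mags.append(math.hypot(gx, gy))
    return mags


def spatial_information(frames):
    """SI: largest per-frame standard deviation of the Sobel magnitude."""
    return max(pstdev(sobel_magnitudes(f)) for f in frames)


def temporal_information(frames):
    """TI: largest standard deviation of the pixel-wise frame difference."""
    values = []
    for prev, cur in zip(frames, frames[1:]):
        diff = [c - p for row_p, row_c in zip(prev, cur)
                for p, c in zip(row_p, row_c)]
        values.append(pstdev(diff))
    return max(values)
```

A static, featureless sequence gives SI = TI = 0 under these definitions, while a cut from a flat frame to a frame containing an edge raises both.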

MPEG encoders have the property of compressing the sequence using the natural temporal and spatial redundancy of the source material. However, if a constrained bit-rate is set then the encoder achieves the target bit-rate by making a series of optimisations on the picture until the bit-rate is reduced to the required level, which inevitably introduces degradation into the resultant picture. Hence, the encoding parameters were set to give optimal quality without a set constraint on the bit-rate of the resultant digital sequence. This encoding process allowed the encoder to maximise the use of the inherent redundancy of the source material in the compression process, without compromising the quality of the encoded sequence. Generally, the contents of faster clips are more difficult for the MPEG coder to handle because they contain less redundant information than the slower clips.

Table 1: Levels of factors in main tests

Experiment  Level    P_t (mean cells/frame)  A_cl        A_be        V_cl        V_be
MPEG-1      level 1  54                      0           0           0           0
MPEG-1      level 2  135                     8 × …       5 × 10^-5   2 × …       5 × 10^-5
MPEG-1      level 3  266                     5 × 10^-5   2 × 10^-4   2 × 10^-5   2 × 10^-4
MPEG-2      level 1  927                     0           0           0           0
MPEG-2      level 2  1004                    9 × 10^-8   5 × 10^-6   5 × 10^-8   5 × 10^-6
MPEG-2      level 3  1051                    6 × 10^-…   3 × …       2 × 10^-…   3 × 10^-5

In both tests, the experimental investigation involved varying levels of bit error and cell loss on both the audio and video streams of different programme types. The resultant degraded audio-visual combinations were then tested repeatedly for their subjective quality to obtain the mean quality values, the mean opinion scores (MOS), for each sequence.

The factors involved in the one-way tests were: (i) programme type ($P_t$), (ii) audio cell loss ($A_{cl}$), (iii) audio bit error ($A_{be}$), (iv) video cell loss ($V_{cl}$) and (v) video bit error ($V_{be}$).

The selection of the number of levels is of critical importance in the design of a test. Pilot tests were conducted to obtain an insight into the perceptibility of degradation on such sequences. On the basis of the pilot test results, the number of levels of each factor was chosen to be three, to minimise the magnitude of the test design. This figure was selected as it is the minimum number of levels required to identify any two-factor interactions within the resultant statistical models. The factor levels described in Table 1 were selected with the following considerations in mind:

Level 1 represents the ‘control’ level (an impairment-free network). Level 2 represents an error, loss-event or delay setting such that on average the subjects in the pilot tests rated the degradation at this level as ‘perceptible’ (level 4 on the impairment scale). Level 3 was selected on the basis that the mean opinion score (MOS) for that particular impairment level represented ‘slightly annoying’ (level 3 on the impairment scale). This criterion was used since further preliminary testing had indicated that the combinations of loss and error would yield overall test results that cover the entire scale of quality assessment.

In keeping with the balanced test design, three levels of the programme type factor were chosen. The three sequences used in the MPEG-1 (M1) tests were as follows. The first, ‘Stella Artois’ (M1P1), represents material of low temporal nature, with occasional changes in the picture. The majority of the sequence consists of head-and-shoulders views, with most of the screen area being static. With static material, once a frame has been encoded, subsequent frames are rarely coded; hence the effects of cell loss and bit errors do not accumulate but are repeated. The accompanying audio is subtle, and consists of background music, with some voice. The second sequence, ‘The National Lottery’ (M1P2), exhibits a mixture of fast and slow motion. In addition, computer-generated graphics are also present. This type of sequence is important in the testing procedure, since the discrete cosine transform (DCT)-based coding of the MPEG algorithm is known to be better at encoding natural images than computer-generated images [19]. The accompanying audio consists of narration


and background music that is synchronised to the scene changes in the video. The third sequence, ‘Adidas Predator’ (M1P3), is the extreme case, where the motion is very fast with more than 12 scene changes taking place in a 20-s period. Intraframe coding is very common during scene changes, with the result that cell losses and bit errors introduce degradations that accumulate and hence can be very disturbing.

The sequences for the MPEG-2 (M2) tests were selected with similar considerations to those used when selecting for the MPEG-1 tests. The sequence exhibiting low temporal characteristics was the ‘Horse’ (M2P1), taken from the motion picture ‘Maverick’. This sequence contains very little camera motion and consists of a man riding a horse towards the camera; the background of the shot is still, with the motion consisting only of the horse and the man. The background audio is subtle music with narration. The second sequence was the ‘Stairs’ (M2P2), also taken from the motion picture ‘Maverick’. This sequence consists of both fast and slow motion and includes a camera pan of a man walking down stairs and talking to people at the bottom. The fast motion sequence used was the ‘Stampede’ (M2P3), taken from the motion picture ‘Jumanji’, which consists of computer-generated animals breaking through a wall and rushing through a house. Ten scene changes take place in the 20-s duration. In the tests, the encoded cells per frame value of each test sequence was used to classify the programme-type factor.

The test design was based on a ‘complete design’, in which every factor and every factor level is considered in combination with every other factor level. A combination of factors, each at a particular level, is referred to as a ‘treatment combination’. The five factors each had three levels and the total number of treatment combinations for a complete design was therefore $3^5 = 243$. From the results of the pilot tests, it was determined that for each combination ten data measurements were required, and hence the total number of sequences tested was 2430. With regard to the duration, the EBU [20] and ITU-R [21] recommend that the test sequence be of 15-20 s in duration for television picture assessment. The CCIR [22] recommends similar length sequences for sound assessment. With regard to testing duration it must be remembered that some time is required for the subjects to enter their scores. For this purpose some 5-10 s are required between the sequences. The standard ACR (absolute category rating c.f. single stimulus [21]) method [23] was adopted, as it was best suited to the objectives of the experiment. In the tests, each treatment combination was of 20 s duration, with a 5-s pause between the sequences for the subjects to enter their scores. The total testing time was 18 min 45 s for each subject, which allowed all 2430 sequence presentations to be covered with a total of 45 presentations for each of the 54 subjects.
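The design arithmetic quoted above can be checked with a short calculation. The variable names are illustrative only; the numbers are those given in the text.

```python
factors = 5          # P_t, A_cl, A_be, V_cl, V_be
levels = 3           # levels per factor
repeats = 10         # data measurements per treatment combination
subjects = 54
clip_s, pause_s = 20, 5   # sequence length and scoring pause, in seconds

combinations = levels ** factors                 # complete design
presentations = combinations * repeats           # total sequences shown
per_subject = presentations // subjects          # sequences seen by each subject
session_s = per_subject * (clip_s + pause_s)     # session length in seconds
minutes, seconds = divmod(session_s, 60)

print(combinations, presentations, per_subject, f"{minutes} min {seconds} s")
```

The 18 min 45 s figure includes the 5-s scoring pause after every sequence.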

To minimise learning effects and prevent the familiarisation of the subjects with the material being presented, it was necessary to randomise various aspects of the presentation. Specifically, randomisation had to be carried out in respect of the subjects and the order in which each subject was presented with treatment combinations. The treatment combinations in the test design were arranged such that all main effects and all two-factor interactions could be estimated free from biases arising from subjects and from the order of presentation of the clips. To this end, each subject was presented with each level of each factor equally often.

The 54 subjects used in the MPEG-1 tests were randomly selected from a pool of staff and students of


Imperial College. The MPEG-2 tests also used 54 subjects, and these were selected from a pool of staff from the MPEG-2 equipment provider, FDS Ltd., and students of the University of Lancashire. For both experiments, rooms were selected that were free from external noise, and were suited to the purposes of subjective testing. The audio/video playback and listening/viewing devices were in perfect operational condition; audio levels were kept constant, and the picture parameters (i.e. contrast, colour, brightness, etc.) were kept identical for all subjects. As recommended in [21] the viewing position was fixed at six picture heights from the screen. The subjects were instructed, both by a written instruction sheet and verbally, that the test comprised a number of audio-visual sequences, and that the audio and video portions of the sequence might be degraded by different amounts. The subjects were instructed to observe the entire sequence and to indicate on a standard five-point quality assessment scale [21] their views as to the acceptability of the audio quality, the video quality and the overall combined audio-visual quality of the sequence. Before commencement of the test proper, each of the subjects was shown samples of 20-s sequences consisting of the three programme types used, both impaired and unimpaired. This was done to acquaint them with the quality of the sequences involved, and with the types of impairments they would encounter. The duration of these sample sequences was the same as the duration of the treatments in the actual test, and hence the subjects could also familiarise themselves with the sequence duration. The pre-test sessions were thought to be essential in informing the subjects as to what should be regarded as ‘excellent’ quality for the system, and provided a reference on the five-point quality scale with which they could compare the subsequently presented material.
When the subjects were comfortable with their surroundings and understood their tasks, the relevant sequences were played back for viewing. The total testing procedure involved 45 20-s sequences presented one after another, with a 5-s break between the sequences, during which a black screen was shown, so that the subjects could enter their audio, video and overall audio-visual quality ratings.

4 Statistical analysis of test results

The statistical analysis methods for both series of experiments were identical. The methods employed were designed to identify which of the experimental factors contributed to variations in (i) the audio quality, (ii) the video quality and (iii) the combined overall quality. This was done using the analysis of variance (ANOVA) technique. Also, multiregression techniques were employed to quantify the effects of the relevant factors on the respective qualities, and hence produce models for describing these effects mathematically. ANOVA and multiregression analysis methods were also used to assess the effect on the combined overall quality of the individual audio and video quality. The Macanova software package [24] was used for the statistical analysis, since it computed the results for each term by considering all of the terms in the model, as opposed to computing them sequentially (e.g. in the order in which they are presented in the model). This method of statistical computation generally allows for greater precision when interactions are present between factors [24].

The results from the MPEG-1 experiments are given in Table 2, and the MPEG-2 results in Table 3. The models obtained are shown graphically for the MPEG-1 and MPEG-2 experiments in Figs. 1-7 and Figs. 8-14, respectively.

Table 2: Results of MPEG-1 regression analysis

Factor       M1Q_a                  M1Q_v                  M1Q_o                  M1Q_oc
Constant     α_a   4.43             α_v   4.30             α_o   4.91             α_oc   -0.303
P_t          β_a1  -6.96 × 10^-4    β_v1  -3.56 × 10^-3    β_o1  -3.81 × 10^-3
A_cl         β_a2  -15924                                  β_o2  -1071.90
A_be         β_a3  -4134.90                                β_o3  -4352.60
V_cl                                β_v2  -25200           β_o4  -19062.00
V_be                                β_v3  -2802.50         β_o5  -4694.50
A_cl·A_be    β_a4  -9.52 × 10^6                            β_o6  -3.66 × 10^5
V_cl·V_be                           β_v4  -1.28 × 10^7     β_o7  -1.60 × 10^6
Q_a                                                                               β_oc1  0.205
Q_v                                                                               β_oc2  0.896
R^2          0.76                   0.74                   0.70                   0.62

In the P_t row, the two values shown under M1Q_v and M1Q_o are assigned in the order in which they appear in the source.

Table 3: Results of MPEG-2 regression analysis

Factor       M2Q_a                    M2Q_oc
Constant     α_a   4.8942             α_oc   -1.50286
P_t          β_a1  -7.849 × 10^-…
A_cl         β_a2  -1.265 × 10^6
A_be         β_a3  -35255
A_cl·A_be    β_a4  -4.922 × 10^10
Q_a                                   β_oc1  0.43770
Q_v                                   β_oc2  0.99639

The entries of the M2Q_v column (α_v, β_v1 to β_v4) and the M2Q_o column (α_o, β_o1 to β_o7) cannot be aligned with their rows with certainty. They appear in the source in the following order: 4.9498; -0.0010226; -5.894 × 10^6; -29963.4; -1.006 × 10^9; then 5.57127; -0.0011436; -4.416 × 10^5; -1.382 × 10^4; -4.465 × 10^6; -3.146 × 10^4; -3.715 × 10^10; -8.458 × 10^10. The R^2 entries appear in the order 0.83487, 0.90062, 0.78053, 0.79816.

Fig. 1 Estimated MPEG-1 audio quality against audio cell loss rate and audio bit error rate (sequences: Stella Artois, National Lottery, Adidas Predator)

Fig. 2 Estimated MPEG-1 video quality against video cell loss rate and video bit error rate (sequences: Stella Artois, National Lottery, Adidas Predator)

Fig. 3 Conditional effect plot of estimated MPEG-1 audio quality averaged over programme type (horizontal axis: A_cl × 10^-4; curves: A_be = 0.0, 0.00005 and 0.00020; δ_a1 = 0.9256 MOS, δ_a2 = 1.1351 MOS)

Fig. 4 Conditional effect plot of estimated MPEG-1 video quality averaged over programme type (horizontal axis: V_cl × 10^-4; curves: V_be = 0.0, 0.00005 and 0.00020; δ_v1 = 0.5733 MOS, δ_v2 = 0.7525 MOS)

Fig. 5 Conditional effect plot of estimated overall MPEG-1 quality averaged over programme type (curves: A_be at levels 1 to 3, with V_cl and V_be both at level 1 or both at level 3)

Fig. 6 Conditional effect plot of estimated overall MPEG-1 quality averaged over programme type (horizontal axis: V_cl; curves: V_be at levels 1 to 3, with A_cl and A_be both at level 1 or both at level 3)

Fig. 7 Audio/video contribution to overall MPEG-1 quality

Fig. 8 Estimated MPEG-2 audio quality against audio cell loss rate and audio bit error rate (sequences: horse, stairs, stampede)

Fig. 9 Estimated MPEG-2 video quality against video cell loss rate and video bit error rate (sequences: horse, stairs, stampede)

Fig. 11 Conditional effect plot of estimated MPEG-2 video quality averaged over programme type (horizontal axis: V_cl × 10^-7; curves: V_be = 0.0, 0.000005 and 0.000030; δ_v1 = 0.9592 MOS, δ_v2 = 1.4419 MOS)

Fig. 12 Conditional effect plot of estimated overall MPEG-2 quality averaged over programme type (horizontal axis: A_cl × 10^-7; curves: A_be at levels 1 to 3, with V_cl and V_be both at level 1 or both at level 3)

Fig. 14 Audio/video contribution to overall MPEG-2 quality (axes: audio quality Q_a and video quality Q_v)

The Tables relate to the multiregression results: each factor investigated (column 1) is associated with a regression coefficient shown in subsequent columns, depending on the model being investigated. The last rows in these Tables give the calculated coefficient of determination $R^2$, which indicates how much of the variance in the data is accounted for by the regression model.
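For reference, the coefficient of determination can be computed from observed and model-predicted scores as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses that standard definition; the function name is illustrative, and whether the analysis package used in the study reports exactly this quantity is an assumption.

```python
def r_squared(observed, predicted):
    """Fraction of the variance in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

A model that reproduces every score gives 1, and a model that always predicts the mean score gives 0.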

4.1 Audio and video quality models

The statistical evaluation of the audio and video quality models is based initially on identifying the factors and factor interactions that are statistically significant in explaining any variations in audio and video quality. The functional relationships for these models can be written as

$$Q_a = f[A_{cl}, A_{be}, V_{cl}, V_{be}, P_t] \qquad (1)$$

for the audio quality model, and

$$Q_v = f[A_{cl}, A_{be}, V_{cl}, V_{be}, P_t] \qquad (2)$$

for the video quality model. In these equations $A_{cl}$ denotes the audio cell loss, $A_{be}$ the audio bit error, $V_{cl}$ the video cell loss, $V_{be}$ the video bit error and $P_t$ the programme type. These functional relationships relate to the evaluation of both the MPEG-1 and MPEG-2 experimental data.

From the ANOVA results for both the MPEG-1 and MPEG-2 data, it was found that the $P_t$, $A_{cl}$ and $A_{be}$ factors are statistically significant in explaining variations in audio quality, and the factors $P_t$, $V_{cl}$ and $V_{be}$ are statistically significant in explaining variations in video quality. A single two-factor interaction is found to be statistically significant for each model: the interaction between $A_{cl}$ and $A_{be}$ for the audio quality model and the interaction between $V_{cl}$ and $V_{be}$ for the video quality model.

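The resultant audio and video regression models are

$${}^{Mk}Q_a = {}^{Mk}\alpha_a + {}^{Mk}\beta_{a1} P_t + {}^{Mk}\beta_{a2} A_{cl} + {}^{Mk}\beta_{a3} A_{be} + {}^{Mk}\beta_{a4} A_{cl} A_{be} \quad (k = 1, 2) \qquad (3)$$

and

$${}^{Mk}Q_v = {}^{Mk}\alpha_v + {}^{Mk}\beta_{v1} P_t + {}^{Mk}\beta_{v2} V_{cl} + {}^{Mk}\beta_{v3} V_{be} + {}^{Mk}\beta_{v4} V_{cl} V_{be} \quad (k = 1, 2) \qquad (4)$$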

respectively, where ${}^{Mk}\beta_{a1}$ to ${}^{Mk}\beta_{a4}$ and ${}^{Mk}\beta_{v1}$ to ${}^{Mk}\beta_{v4}$ are the corresponding regression coefficients, and ${}^{Mk}\alpha_a$ and ${}^{Mk}\alpha_v$ are the constants or ‘intercepts’. From the ANOVA results, eqns. 3 and 4 can be applied to both MPEG-1 and MPEG-2 experimental data. The $Mk$ notation indicates the data to which the equations/coefficients relate, with $k = 1$ denoting MPEG-1 data and $k = 2$ MPEG-2 data.

The results of the multiregression analysis can be found in the second and third columns of Table 2 for the MPEG-1 audio and video models (${}^{M1}\alpha_a$, ${}^{M1}\alpha_v$, ${}^{M1}\beta_{a1}$ to ${}^{M1}\beta_{a4}$ and ${}^{M1}\beta_{v1}$ to ${}^{M1}\beta_{v4}$) and the second and third columns of Table 3 for the MPEG-2 audio and video models (${}^{M2}\alpha_a$, ${}^{M2}\alpha_v$, ${}^{M2}\beta_{a1}$ to ${}^{M2}\beta_{a4}$ and ${}^{M2}\beta_{v1}$ to ${}^{M2}\beta_{v4}$).
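To make the form of eqn. 3 concrete, the sketch below evaluates the MPEG-1 audio model for the low-motion sequence ($P_t$ = 54 cells/frame) with no impairment and with both audio factors at their highest test level. The default coefficients are the MPEG-1 audio values as read from Table 2 and the factor levels come from Table 1; they should be treated as illustrative, and the function name is hypothetical.

```python
def audio_quality(p_t, a_cl, a_be,
                  alpha=4.43, b1=-6.96e-4, b2=-15924.0, b3=-4134.90, b4=-9.52e6):
    """Eqn. 3: intercept, three main effects and the A_cl x A_be interaction."""
    return alpha + b1 * p_t + b2 * a_cl + b3 * a_be + b4 * a_cl * a_be


clean = audio_quality(54, 0.0, 0.0)        # impairment-free network
worst = audio_quality(54, 5e-5, 2e-4)      # both audio factors at level 3
interaction = -9.52e6 * 5e-5 * 2e-4        # contribution of the product term

print(round(clean, 3), round(worst, 3), round(interaction, 4))
```

With these numbers the product term contributes roughly 0.1 MOS out of a total drop of about 1.7 MOS.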

The degradation of the audio and video quality increases with increasing levels of cell loss and bit error in both the MPEG-1 (Figs. 1 and 2, respectively), and MPEG-2 (Figs. 8 and 9, respectively) results.

The conditional effect plots Figs. 3 and 4 show, respectively, the effects of different levels of audio cell loss and bit error rate on the MPEG-1 audio quality averaged over programme type, and video cell loss and bit error rate on MPEG-1 video quality averaged over programme type. The corresponding conditional effect plots for the MPEG-2


experiments are given in Figs. 10 and 11. In these Figures, it can be seen that for the MPEG-1 experiment the interaction effects that were found by ANOVA analysis to be significant, are small for the ranges concerned. However, the same interaction effects for the MPEG-2 data are considerably larger. The interaction between the cell loss and bit error factors, for both audio and video quality (in both MPEG-1 and MPEG-2 experiments), accounts for the increased rate of degradation when the highest levels of cell loss and bit error are combined. As the interaction terms are small for the MPEG-1 data, Figs. 1 and 2 are very nearly planar; the interaction can be seen more clearly with the MPEG-2 data in Figs. 10 and 11, which exhibit nonplanar characteristics.

4.2 Combined overall quality

The general functional relationship between the combined overall quality and the independent variables used in the ANOVA analysis is

$$Q_o = f[A_{cl}, A_{be}, V_{cl}, V_{be}, P_t] \qquad (5)$$

The ANOVA results found that the five factors $A_{cl}$, $A_{be}$, $V_{cl}$, $V_{be}$ and $P_t$ are statistically significant in their effect on the overall quality for both the MPEG-1 and MPEG-2 data. The results show that all two-factor interactions are statistically insignificant, except for the interactions between $A_{cl}$ and $A_{be}$, and between $V_{cl}$ and $V_{be}$. The related regression equation is

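$${}^{Mk}Q_o = {}^{Mk}\alpha_o + {}^{Mk}\beta_{o1} P_t + {}^{Mk}\beta_{o2} A_{cl} + {}^{Mk}\beta_{o3} A_{be} + {}^{Mk}\beta_{o4} V_{cl} + {}^{Mk}\beta_{o5} V_{be} + {}^{Mk}\beta_{o6} A_{cl} A_{be} + {}^{Mk}\beta_{o7} V_{cl} V_{be} \quad (k = 1, 2) \qquad (6)$$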

where ${}^{Mk}\beta_{o1}$ to ${}^{Mk}\beta_{o7}$ inclusive are the corresponding regression coefficients and ${}^{Mk}\alpha_o$ is the constant or ‘intercept’. The associated regression analysis results are given in the fourth column of Table 2 for the MPEG-1 data and in the fourth column of Table 3 for the MPEG-2 data.

The conditional effect plots shown in Figs. 5 and 6 illustrate the relationships between the overall MPEG-1 quality and the independent variables. In these Figures the ‘levels’ in the key relate to the corresponding values given in Table 1. With regard to the overall MPEG-2 quality, conditional effect plots are shown in Figs. 12 and 13. It is clear from these Figures that the overall quality ${}^{Mk}Q_o$ ($k = 1, 2$) becomes increasingly degraded with increasing levels of $A_{cl}$, $A_{be}$, $V_{cl}$ and $V_{be}$, and that the video factors $V_{cl}$ and $V_{be}$ have greater degrading effects on the overall quality than the audio factors $A_{cl}$ and $A_{be}$. Also from these Figures it can be seen that the interaction effect between the cell loss and bit error factors increases as higher levels of payload bit error and cell loss are combined. The magnitude of interaction effects is greater for MPEG-2 than MPEG-1. This can be seen by noting that $\delta_{o1} < \delta_{o2}$, $\delta_{o3} < \delta_{o4}$, $\delta_{o5} < \delta_{o6}$ and $\delta_{o7} < \delta_{o8}$.

4.3 Effects of combined audio and video losses on perceived overall quality

In designing a practical network capable of transmitting audio-visual signals, it is important to be able to understand how the quality of the audio and video, and any assorted errors or losses, contribute to the overall perceived quality. Using the test results obtained it is possible to provide important information in this respect. As indicated, three scores relating to the audio score, the video score and


the overall quality score were obtained for each sequence presented. Using an ANOVA, relationships between the various scores can be determined. From the ANOVA results it is found that the interaction between the audio and video is not statistically significant in either the MPEG-1 or the MPEG-2 model. Thus, the overall quality ${}^{Mk}Q_{oc}$ can be related to the individual overall audio and overall video quality through the regression equation

$${}^{Mk}Q_{oc} = {}^{Mk}\alpha_{oc} + {}^{Mk}\beta_{oc1} Q_a + {}^{Mk}\beta_{oc2} Q_v \qquad (7)$$

where ${}^{Mk}\beta_{oc1}$ and ${}^{Mk}\beta_{oc2}$ are the corresponding regression coefficients and ${}^{Mk}\alpha_{oc}$ is the constant or ‘intercept’.

The results of a regression analysis of the data from the MPEG-1 and MPEG-2 models are given in the fifth columns of Tables 2 and 3, respectively. If it is assumed that values of ${}^{Mk}\alpha_{oc} + {}^{Mk}\beta_{oc1} Q_a + {}^{Mk}\beta_{oc2} Q_v \geq 5$ are taken as giving overall quality $Q_{oc} = 5$, and that values of ${}^{Mk}\alpha_{oc} + {}^{Mk}\beta_{oc1} Q_a + {}^{Mk}\beta_{oc2} Q_v \leq 1$ result in an overall quality $Q_{oc} = 1$, then using the data from Tables 2 and 3, eqn. 7 can be plotted as in Figs. 7 and 14.

The Figures show contour plots of the interrelation between $Q_a$ and $Q_v$, with the contour line separating the plane given by the equation ${}^{Mk}\alpha_{oc} + {}^{Mk}\beta_{oc1} Q_a + {}^{Mk}\beta_{oc2} Q_v = 3$. These Figures provide a direct method for determining the values of ${}^{Mk}Q_a$ and ${}^{Mk}Q_v$ that result in an overall quality ${}^{Mk}Q_{oc}$ that is ‘fair’ or better.
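Eqn. 7, together with the clipping to the five-point scale described above, can be sketched as follows. The coefficient triples are the $Q_{oc}$ values as read from Tables 2 and 3; the dictionary keys, function names and the `fair_or_better` helper are illustrative assumptions.

```python
# (alpha_oc, beta_oc1, beta_oc2) as read from the fifth columns of Tables 2 and 3
COEFFS = {
    "MPEG-1": (-0.303, 0.205, 0.896),
    "MPEG-2": (-1.50286, 0.43770, 0.99639),
}


def overall_quality(system, q_a, q_v):
    """Eqn. 7, clipped to the range 1 to 5 of the quality scale."""
    alpha, b_audio, b_video = COEFFS[system]
    raw = alpha + b_audio * q_a + b_video * q_v
    return min(5.0, max(1.0, raw))


def fair_or_better(system, q_a, q_v):
    """True when the predicted overall quality reaches the 'fair' contour (3)."""
    return overall_quality(system, q_a, q_v) >= 3.0


print(overall_quality("MPEG-1", 5, 2), overall_quality("MPEG-1", 2, 5))
```

In both coefficient sets the video weight is several times the audio weight, so a poor video score pulls the prediction down much further than an equally poor audio score.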

5 Discussion on findings and a comparison of MPEG-1 / MPEG-2 systems

Figs. 1 and 8 indicate that the programme-type factor has a very mild influence on the audio quality for any given level of bit error rate and cell loss. This may be due to the MPEG audio-coding algorithm being independent of the temporal nature of the audio sequence [25]. In Figs. 2 and 9, noticeable differences can be seen in respect of the subjective video quality for different programme types, with greater sensitivity in the ‘fast’ sequences than in the ‘slow’ sequences. It can therefore be deduced that if a sequence with a low-temporal content were to be transmitted along with a sequence having a relatively high-temporal content, the former would be able to withstand greater degrees of loss and error for a given level of reproduced quality. Clearly, sequences are most vulnerable to the effects of network overload during scenes in which there is vigorous motion. However, with the higher quality MPEG-2 sequences, individual degradations are more visible in the sequence and hence are more easily distinguished by the subjects. As a consequence, the magnitude of the interaction effects can be seen to be greater for the MPEG-2 sequences than the MPEG-1 sequences, which suggests that more stringent controls are required to maintain quality levels when multiple impairing factors are present.

In an MPEG system stream, the audio and video are multiplexed together. Therefore, provided the stream is sufficiently long, as will be the case with broadcast material, the cell loss rates and the bit error rates will be the same for both the audio and video component streams, which means that $A_{cl} = V_{cl} = C_l$ and $A_{be} = V_{be} = B_e$. In this case the original regression equation (eqn. 6) can be rewritten as

Using the regression coefficients derived from the experiments, together with eqn. 8, the following conclusions can be drawn:

In the case of MPEG-1 the values of $C_l$ and $B_e$ are small for the ranges concerned, and the term $(\beta_{o6} + \beta_{o7}) \cdot C_l \cdot B_e$ is negligible when compared with the other terms in the equation. In the case of MPEG-2, however, the interaction effect can be shown to be more significant.

This equation provides direct information on how the overall quality varies with cell loss rate and bit error rate, and hence provides information on the cell loss and bit error rates needed to obtain a specified level of overall perceived quality.

Hence, if the temporal properties (programme-type characteristics) of a sequence are known, it is straightforward to calculate the combinations of maximum cell loss rate and bit error rate that achieve a desired MOS quality grade. It can also be shown that the programme type has some effect on the overall quality. The difference, however, is small, and hence a sensible system design approach would be to assume that the programme type is of the highest level of activity, in the knowledge that any other type would result in a superior quality.
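The calculation described above can be sketched as follows. Because the rewritten regression equation is not reproduced here, the sketch assumes a generic model with linear terms in cell loss rate and bit error rate plus a single interaction term, $Q = b_0 + b_c C_l + b_b B_e + b_{cb} C_l B_e$. This form, the coefficient values and the function name are illustrative assumptions, not the fitted model from the study.

```python
def max_bit_error_rate(target_mos, cell_loss, b0, b_c, b_b, b_cb):
    """Largest bit error rate that still meets target_mos at a given cell loss rate.

    Assumes Q = b0 + b_c*C + b_b*B + b_cb*C*B (an assumed form), with quality
    falling as B rises. Returns None when the target cannot be met even with
    zero bit errors.
    """
    slope = b_b + b_cb * cell_loss  # change in Q per unit bit error rate
    if slope >= 0:
        raise ValueError("model must predict lower quality as bit errors increase")
    headroom = b0 + b_c * cell_loss - target_mos  # quality margin at B = 0
    if headroom < 0:
        return None
    return headroom / -slope


# Placeholder coefficients (assumed for illustration only).
coeffs = dict(b0=4.5, b_c=-200.0, b_b=-1000.0, b_cb=-50000.0)

for c_l in (0.0, 1e-3, 5e-3, 1e-2):
    limit = max_bit_error_rate(3.0, c_l, **coeffs)
    print(f"cell loss rate {c_l:g}: maximum bit error rate {limit}")
```

In line with the design approach suggested above, the coefficients used in such a calculation would be those for the programme type with the highest level of activity, so that any other programme type yields a quality at least as good as the target.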

From the models obtained, and using the regression coefficients from this study, it is clear that although perceived overall quality depends on both audio and video quality, its sensitivity to impairments is heavily biased towards the quality of the corresponding video.

6 Conclusions and future work

A series of subjective tests has been performed to determine the combined effects of cell loss and bit error on the quality of MPEG-1 and MPEG-2 broadcast systems. Multiregression models have been formed to describe the degradation of the individual user-perceived audio and video quality, and the overall combined quality of MPEG-1 and MPEG-2 broadcast applications transmitted over ATM networks in the presence of cell loss and bit error. Models have also been developed to describe the user-perceived evaluation of overall combined quality in terms of the audio and video quality of the received sequence. In general, the models formed showed that the underlying relationships between the impairing factors and the measured quality are similar for both MPEG-1 and MPEG-2 systems. Interaction effects between the impairing factors have been identified and quantified, and found to be more prominent for the higher quality MPEG-2 sequences, which suggests that more stringent controls may be necessary in determining quality levels when multiple impairing factors are present. Investigations are currently underway into the performance of real-time, two-way systems.

7 Acknowledgments

The authors would like to thank Dr. L. White, Department of Mathematics, Imperial College, London, for her assistance in the design of the tests, and help in the randomisation and ordering of the treatment combinations. Due acknowledgment is also given to Mr. N. Walmsley of FDS Ltd., Preston, Lancashire (http://www.mpeg.co.uk/), for supplying the MPEG-2 equipment, premises and technical expertise, which were invaluable in completing the MPEG-2 experiments.

443 IEE Proc.-Vis. Image Signal Process., Vol. 147, No. 5, October 2000


8 References

1 YAMAZAKI, K., WADA, M., TAKASHIMA, Y., and WAKAHARA, Y.: ‘ATM networking and video-coding techniques for QOS control in B-ISDN’, IEEE Trans. Circuits Syst. Video Technol., 1993, 3, (3), pp. 175-181

2 IAI, S., and KITAWAKI, N.: ‘Effects of cell loss on picture quality in ATM networks’, Electron. Commun. Jpn. 1, Commun., 1992, 75, (10)

3 HUGHES, C.J., GHANBARI, M., PEARSON, D.E., SEFERIDIS, V., and XIONG, J.: ‘Modeling and subjective assessment of cell discard in ATM video’, IEEE Trans. Image Process., 1993, 2, (2), pp. 212-222

4 SEFERIDIS, V., GHANBARI, M., and PEARSON, D.E.: ‘Forgiveness effect in subjective assessment of packet video’, Electron. Lett., 1992, 28, (21), pp. 2013-2014

5 GHANBARI, M., and HUGHES, C.J.: ‘Packing coded video signals into ATM cells’, IEEE/ACM Trans. Netw., 1993, 1, (5)

6 CHEE HENG, T., and ZHANG, L.: ‘Effects of cell loss on the quality of service for MPEG video in ATM environment’. Proceedings of IEEE Singapore International Conference on Networks / International Conference on Information Engineering, IEEE SICON/ICIE, 1995, pp. 11-15

7 JEAN, S. et al.: ‘QOS parameter translation for the MPEG services between layers in ATM networks’. Department of Computer Science for Artificial Intelligence Research, Korea Advanced Institute of Science and Technology

8 RILEY, M.J., and RICHARDSON, I.E.G.: ‘Quality of service issues for MPEG-2 video over ATM’. Proceedings of International Broadcast Convention, 12-16 September 1996, pp. 583-587

9 SCHWARTZ, M. et al.: ‘Quality of service requirements for audio-visual multimedia services’, ATM94-0640, ATM Forum, 1994

10 BEERENDS, J.G., and DE CALUWE, F.E.: ‘The influence of video quality on perceived audio quality and vice versa’, J. Audio Eng. Soc., 1999, 47, (5), pp. 355-362

11 PRYCKER, M.: ‘Asynchronous transfer mode: Solution for broadband ISDN’ (Ellis Horwood, New York, 1993, 2nd edn.)

12 MEKY, M., and SAADAWI, T.N.: ‘Degradation effect of cell loss on speech quality over ATM networks’. Proceedings of the International IFIP - IEEE Conference on Broadband Communications, Canada, 1996, pp. 259-270

13 KARLSSON, G.: ‘Asynchronous transfer of video’, IEEE Commun. Mag., 1996, 34, (8), pp. 118-126

14 BAUER, S.: ‘The influence of impairments from digital compression of video signal on perceived picture quality’. Proceedings IWISP ’96, Manchester, November 1996

15 HIDAKA, T. et al.: ‘The actual MPEG subjective assessment test’, J. Inst. Telev. Eng. Jpn., 1995, 49, (4), pp. 03866831 (in Japanese)

16 LOURENS, J.G., MALLESON, H.H., and THERON, C.C.: ‘Optimisation of bit rates, for digitally compressed television services, as functions of acceptable picture quality, and picture complexity’. Proceedings of IEE Colloquium on Digitally compressed TV by satellite, 1995, 12/1-7

17 APTEKER, R.T., FISHER, J.A., KISIMOV, V.S., and NEISHLOS, H.: ‘Video acceptability and frame rate’, IEEE Multimedia, 1995, 2, (3), pp. 32-40

18 ITU-T Methods for objective and subjective assessment of quality, Recommendation P.910, Subjective video quality assessment methods for multimedia applications, 1996

19 CHIARIGLIONE, L.: ‘The development of an integrated audio-visual coding standard: MPEG’, Proc. IEEE, 1995, 83, (2), pp. 151-157

20 EBU Technical Recommendation R 37-1986, The relative timing of the sound and vision components of a television signal, 1986, Brussels

21 ITU-R Recommendation 500-3, Method for the subjective assessment of the quality of television pictures, 1986

22 CCIR Recommendation 562, Subjective assessment of sound quality, Documents of the 13th Plenary Assembly, 1974, Vol. 11

23 ITU-T Methods for objective and subjective assessment of quality, Recommendation P.920, Interactive test methods for audio-visual communications, 1996

24 OEHLERT, G.W., et al.: ‘Macanova statistical analysis software - User manual’, version 4.04, University of Minnesota, http://www.stat.umn.edu/~gary/macanova/macanova.home.html

25 PAN, D.: ‘A tutorial on MPEG/audio compression’, IEEE Multimedia, 1995, 2, (2), pp. 60-74
