ETSI workshop on Effects of transmission performance on Multimedia QoS 17 - 19 June 2008 - Prague, Czech Republic

www.psytechnics.com

Multimedia Quality in a Conversational Video-conferencing Environment

Quan Huynh-Thu, Psytechnics Ltd

[email protected]

Presentation outline

• Introduction to video-conferencing

• Beyond video-conferencing

• HD video-conferencing subjective experiment

• Results

• Discussion

© Psytechnics 2008

Video-conferencing: early hype

What?

• Real-time audio-visual communication

Why?

• Enhance collaboration

Where?

• Desktops, laptops, mobile devices…

• Integration in Unified Communications environment

Video-conferencing: challenges

• Real-time

• Interactive

• Multimedia

• Delay

• Media synchronization

• Bandwidth limited

• IP application

• Transmission errors and network congestion

• Coding artefacts

Video-conferencing: issues

• Technology:

– Video compression (coding)

– IP-based transmission (errors)

– Real-time constraint (delay)

– Two- or multi-party multimedia communication (handling of potentially multiple end-points)

• Human factor:

– 'Un-natural' conversational feeling due to small resolution and camera set-up: “I’m looking at you... but can't see you”

– Higher sensitivity to errors than for media streaming

Video-conferencing: user expectation

Public

• Usage: “Look…”, “Hi…”

• Requirements: cheap, simple usage

Corporate/Enterprise

• Usage: collaboration, increased productivity, face-to-face interaction

• Requirements: cost effective, reliable, quality

Beyond video-conferencing: High-definition

• Real-time audio-video communication

• Enhanced features for tele-collaboration

• Face-to-face feeling (tele-presence)

HD video-conferencing subjective experiment

• Real-time interactive/conversational experiment using high-definition video-conferencing system

• Trials: task-driven 2-min conversations

1. Task requiring focus on visual terminal

2. Task not especially requiring use of visual information

• Various bandwidth/network conditions

Subjective testing facilities

State-of-the-art subjective test rooms at Psytechnics headquarters

Experiment set-up: equipment

• 24” widescreen full HD display (1080p)

• Hardware codec box with associated HD camera (720p@30fps)

• Professional shotgun microphone

• Pair of speakers

Experiment: set-up

[Diagram of the set-up showing the display, camera, microphone and two speakers]

Viewing distance: 4H (80cm)

Conversational task 1: Shape matching

• Purpose: maintain focus on visual terminal

• Description: work in partnership to build a 3D shape from construction blocks

– Subject A was given a bag of multicolour, interlocking construction blocks

– Subject B was given a completed figure made from an identical set of blocks

– Subject B had to instruct subject A on how to build the identical shape

– Subjects had to verify the correctness of the built shape

Conversational task 2: Who’s in the bag?

• Purpose: “free” conversation not especially requiring focus on visual terminal

• Description: identify as many famous characters as possible.

– The “Clue Giver” takes a card from the bag and has to provide clues as to who is on the card without naming the character

– The “Guesser” must name the character and can ask questions

– The “Clue Giver” can skip a card if he/she does not know the personality

Participants

• 20 naïve subjects: 8 females and 12 males

• 16 subjects recruited from public

• 4 subjects from Psytechnics

• Age: 18-72

• None had participated in subjective testing in the past

• None had experienced an HD video-conferencing system in the past

Experimental design

• Bandwidth: 3072 and 1472 kbps

• Packet loss ratio: 0, 0.5, 3 and 6%

• Task: 1 and 2

• No delay or jitter

• Video codec: H.264 baseline profile

• Audio codec: AAC at 64kbps

• Network degradations generated using ITU-T G.1050 IP impairment model

• Full factorial design (3 variables): 16 test conditions
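
The 16 test conditions follow from crossing the three variables: 2 bandwidths × 4 packet loss ratios × 2 tasks. A minimal Python sketch of that enumeration is given here; the variable names are illustrative, and the ordering is not necessarily the condition numbering used in the experiment.

```python
from itertools import product

# Factor levels taken from the experimental design slide.
BANDWIDTHS_KBPS = (3072, 1472)
PACKET_LOSS_PERCENT = (0, 0.5, 3, 6)
TASKS = (1, 2)

# Full factorial design: every combination of every level.
# The ordering here is arbitrary, not the experiment's condition numbering.
conditions = [
    {"task": task, "bandwidth_kbps": bandwidth, "plr_percent": plr}
    for task, bandwidth, plr in product(TASKS, BANDWIDTHS_KBPS, PACKET_LOSS_PERCENT)
]

print(len(conditions))  # 2 * 4 * 2 combinations
```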

Experimental procedure

• Test conditions grouped by task:

1. 8 test conditions with task 1

2. Break

3. 8 conditions with task 2

• Presentation order of test conditions was randomized

• Test condition was identical for both parties in a given trial

• Role of “instruction giver” and “instruction follower” was swapped between test conditions
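
The procedure above can be sketched as a small schedule generator: 8 conditions per task block, shuffled within each block, with the instruction-giver role alternating from trial to trial. The slides do not say how the randomization was carried out, so the seed, the function name `presentation_order` and the choice of which subject starts as instruction giver are assumptions made for illustration.

```python
import random
from itertools import product

BANDWIDTHS_KBPS = (3072, 1472)
PACKET_LOSS_PERCENT = (0, 0.5, 3, 6)

def presentation_order(seed):
    """Build one session: 8 shuffled task-1 trials, then 8 shuffled task-2 trials."""
    rng = random.Random(seed)  # hypothetical seed, e.g. one per pair of subjects
    trials = []
    for task in (1, 2):  # the break would fall between the two blocks
        block = list(product(BANDWIDTHS_KBPS, PACKET_LOSS_PERCENT))
        rng.shuffle(block)  # random order within the task block
        for bandwidth, plr in block:
            # Swap the instruction-giver role on every trial (A starts, by assumption).
            giver = "A" if len(trials) % 2 == 0 else "B"
            trials.append({"task": task, "bandwidth_kbps": bandwidth,
                           "plr_percent": plr, "giver": giver})
    return trials

session = presentation_order(seed=1)
```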

Quality assessment ratings

1. How would you rate the audio quality of the connection?

2. How would you rate the video quality of the connection?

3. How would you rate the overall quality of the connection?

4. Did you have any difficulty in understanding the other party during the connection?

5. Was the overall quality of the connection acceptable for the task?

Quality assessment ratings

• 5-point discrete category rating scale for Qs 1-3: Excellent, Good, Fair, Poor, Bad

• Binary answer for Qs 4-5: Yes, No
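
The results slides report mean opinion scores (MOS) on a 1 to 5 axis. A minimal sketch of how category votes could be turned into a MOS is given here, assuming the usual numeric coding of 5 for Excellent down to 1 for Bad; the slides do not state the coding explicitly, and the example votes are made up rather than taken from the experiment.

```python
from statistics import mean

# Assumed numeric coding of the 5-point category scale (5 = Excellent ... 1 = Bad).
SCORE = {"Excellent": 5, "Good": 4, "Fair": 3, "Poor": 2, "Bad": 1}

def mean_opinion_score(votes):
    """Average the numeric scores of a list of category labels."""
    return mean(SCORE[vote] for vote in votes)

# Hypothetical votes from 20 subjects for one condition (not data from the experiment).
example_votes = ["Good"] * 10 + ["Fair"] * 6 + ["Excellent"] * 4
example_mos = mean_opinion_score(example_votes)
print(example_mos)
```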

Example: Task 1 with 3072 kbps, PLR=0%

Example: Task 1 with 1472 kbps, PLR=6%

Example: Task 2 with 1472 kbps, PLR=0%

Example: Task 2 with 1472 kbps, PLR=6%

Results: distribution of quality ratings

[Figure: three histograms of the distribution of ratings, one each for audio, video and multimedia quality. X-axis: Ratings (1 to 5). Y-axis: Percentage of votes (0 to 100).]

Results: audio quality

[Figure: Audio MOS (axis from 1 to 5), two panels. One panel plots MOS against the bandwidth/PLR condition (3Mbps and 1.4Mbps at PLR=0, 0.5, 3 and 6), with separate series for task 1 and task 2. The other plots MOS against the bandwidth/task condition (1.4Mbps Task1, 1.4Mbps Task2, 3Mbps Task1, 3Mbps Task2), with separate series for PLR=0%, 0.5%, 3% and 6%.]

Results: video quality

[Figure: Video MOS (axis from 1 to 5), two panels. One panel plots MOS against the bandwidth/PLR condition (3Mbps and 1.4Mbps at PLR=0, 0.5, 3 and 6), with separate series for task 1 and task 2. The other plots MOS against the bandwidth/task condition (1.4Mbps Task1, 1.4Mbps Task2, 3Mbps Task1, 3Mbps Task2), with separate series for PLR=0%, 0.5%, 3% and 6%.]

Results: multimedia quality

[Figure: Multimedia MOS (axis from 1 to 5), two panels. One panel plots MOS against the bandwidth/PLR condition (3Mbps and 1.4Mbps at PLR=0, 0.5, 3 and 6), with separate series for task 1 and task 2. The other plots MOS against the bandwidth/task condition (1.4Mbps Task1, 1.4Mbps Task2, 3Mbps Task1, 3Mbps Task2), with separate series for PLR=0%, 0.5%, 3% and 6%.]

ANOVA: audio quality

Source            Sum Sq.  d.f.  Mean Sq.  F         Prob>F      Nobs  Effect size
BitRate           4.05     1     4.05      7.832061  0.01145901  160   0.1591
PLR               23.575   3     7.858333  13.25222  0.00000113  80    0.3134
Task              0.3125   1     0.3125    0.429864  0.51991635  160   0.0442
BitRate*PLR       1.175    3     0.391667  1.210027  0.31437809  40    0.0990
BitRate*Task      0.1125   1     0.1125    0.448819  0.51095717  80    0.0375
PLR*Task          3.3625   3     1.120833  3.755327  0.01567849  40    0.1674
BitRate*PLR*Task  0.9125   3     0.304167  1.264357  0.29525098  20    0.1233
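
As a reading aid for the ANOVA tables, the sketch below re-enters the audio-quality rows, checks that each mean square equals the sum of squares divided by its degrees of freedom, and lists the sources whose Prob>F falls below a threshold. The 0.05 threshold is an assumption for illustration; the slides do not state the significance level used.

```python
# Rows from the audio-quality ANOVA table: (source, sum_sq, df, mean_sq, p_value)
AUDIO_ANOVA = [
    ("BitRate", 4.05, 1, 4.05, 0.01145901),
    ("PLR", 23.575, 3, 7.858333, 0.00000113),
    ("Task", 0.3125, 1, 0.3125, 0.51991635),
    ("BitRate*PLR", 1.175, 3, 0.391667, 0.31437809),
    ("BitRate*Task", 0.1125, 1, 0.1125, 0.51095717),
    ("PLR*Task", 3.3625, 3, 1.120833, 0.01567849),
    ("BitRate*PLR*Task", 0.9125, 3, 0.304167, 0.29525098),
]

ALPHA = 0.05  # assumed significance level; not stated on the slides

# Mean square is the sum of squares divided by its degrees of freedom.
for source, sum_sq, df, mean_sq, _ in AUDIO_ANOVA:
    assert abs(sum_sq / df - mean_sq) < 1e-5, source

# Sources whose p-value is below the assumed threshold.
significant = [row[0] for row in AUDIO_ANOVA if row[4] < ALPHA]
print(significant)
```

Under that assumed threshold, the same check can be repeated for the video and multimedia tables by swapping in their rows.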

ANOVA: video quality

Source            Sum Sq.  d.f.  Mean Sq.  F         Prob>F      Nobs  Effect size
BitRate           4.05     1     4.05      9.529412  0.00606871  160   0.1591
PLR               110      3     36.66667  42.98201  0.00000000  80    0.6770
Task              0.0125   1     0.0125    0.027576  0.86986343  160   0.0088
BitRate*PLR       1.05     3     0.35      0.914089  0.43995935  40    0.0935
BitRate*Task      0.0125   1     0.0125    0.082969  0.77643152  80    0.0125
PLR*Task          5.5375   3     1.845833  4.236034  0.00900974  40    0.2148
BitRate*PLR*Task  0.2375   3     0.079167  0.19716   0.89790687  20    0.0629

ANOVA: multimedia quality

Source            Sum Sq.  d.f.  Mean Sq.  F         Prob>F      Nobs  Effect size
BitRate           4.5125   1     4.5125    19.6533   0.00028535  160   0.1679
PLR               48.775   3     16.25833  25.32036  0.00000000  80    0.4508
Task              0.2      1     0.2       0.968153  0.33750598  160   0.0354
BitRate*PLR       0.4125   3     0.1375    0.418838  0.74016348  40    0.0586
BitRate*Task      0.3125   1     0.3125    0.979381  0.33478807  80    0.0625
PLR*Task          3.225    3     1.075     5.259657  0.00284023  40    0.1639
BitRate*PLR*Task  0.3125   3     0.104167  0.363985  0.77923977  20    0.0722

Intelligibility

• Percentage of subjects (P1) who had no difficulty in understanding the other party during the connection

[Slide annotation: MOS=2.95]

Task 1
Condition  1    2    3    4    5    6    7    8
P1 (%)     100  100  95   95   95   90   95   95

Task 2
Condition  9    10   11   12   13   14   15   16
P1 (%)     95   95   90   90   95   95   100  95

Acceptability

• Percentage of subjects (P2) who found the quality of the connection acceptable for the required task

[Slide annotation: MOS=2.95]

Task 1
Condition  1    2    3    4    5    6    7    8
P2 (%)     100  100  100  100  95   95   100  90

Task 2
Condition  9    10   11   12   13   14   15   16
P2 (%)     95   95   90   90   95   85   90   95
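
With 20 subjects, each 5% step in these tables corresponds to one person. The sketch below converts the acceptability percentages (P2) back into head counts, assuming all 20 subjects answered in every condition. The transcript does not say which bandwidth and packet loss ratio each condition number stands for, so only condition numbers are used.

```python
N_SUBJECTS = 20  # from the Participants slide

# Acceptability percentages (P2) per condition, copied from the tables above.
P2_PERCENT = {
    1: 100, 2: 100, 3: 100, 4: 100, 5: 95, 6: 95, 7: 100, 8: 90,
    9: 95, 10: 95, 11: 90, 12: 90, 13: 95, 14: 85, 15: 90, 16: 95,
}

# Number of subjects who did NOT find the quality acceptable, per condition.
# Assumes all 20 subjects answered in every condition.
not_accepting = {
    cond: N_SUBJECTS - round(pct * N_SUBJECTS / 100)
    for cond, pct in P2_PERCENT.items()
}

# Condition with the most subjects rejecting the quality.
worst = max(not_accepting, key=not_accepting.get)
print(worst, not_accepting[worst])
```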

Summary

• Conversational video-conferencing experiment using:

– Full-factorial design based on 3 variables: bandwidth, packet loss ratio and task

– Random packet loss

– No delay or jitter

• Most participants unexpectedly provided high quality ratings and found the quality acceptable even if video was severely degraded (user expectation)

• The system under test produced similar video quality at both 3Mbps and 1.4Mbps (without packet loss)

• Packet loss ratio was found to be the most important factor influencing multimedia quality amongst the 3 variables considered.

• Statistical analysis showed an interaction effect between the visual impact of packet loss and task on multimedia quality

Discussion

• User expectation: naïve participants (public) might have lower quality expectation than target users (experienced with video / business users)

• Balance/distribution of errors/qualities on the audio and visual signals using real-world systems in subjective experiments

• Suitable tasks to exercise both audio and video components, create eye-contact…

• Conversational tests represent heavy investment for relatively small amount of data:

– Full-factorial design with small number of variables to examine main/interaction effects

– Fractional design with high number of variables but no possibility to examine main/interaction effects
