User Benefits of Non-Linear Time Compression

Liwei He and Anoop Gupta

Microsoft Research

Introduction

Time compression: key to browsing AV content

We focus on informational content

Audio time compression algorithms

Linear: speed up audio uniformly

Non-linear: exploit the fine-grained structure of human speech (e.g. pauses, phonemes)

How much more do users gain from more complex algorithms?

Methodology

Conduct a user listening test

One Linear TC algorithm

Two Non-linear TC algorithms

Simple: Pause-removal followed by Linear TC

Sophisticated: Adaptive TC

Compare objective and subjective measurements

Time Compression Algorithms

Linear Time Compression

Classic algorithms

Overlap Add (OLA) and Synchronized OLA (SOLA)

We use SOLA
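
As a rough illustration of the SOLA idea, the sketch below reads frames from the input at a larger hop than it writes them, and shifts each frame to the position where it best matches the output already written before cross-fading it in. The function name, frame length, overlap, and search range are illustrative assumptions, not the parameters used in the study.

```python
import math


def _similarity(a, b):
    """Normalized cross-correlation of two equal-length sample lists."""
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0


def sola_compress(samples, rate, frame_len=320, overlap=120, search=40):
    """Time-compress `samples` by roughly `rate` (e.g. 1.5) with a basic SOLA loop.

    All default parameter values are assumptions chosen for illustration.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not 0 < search < overlap or overlap + search >= frame_len:
        raise ValueError("need 0 < search < overlap and overlap + search < frame_len")

    syn_hop = frame_len - overlap                 # nominal hop in the output
    ana_hop = max(1, int(round(syn_hop * rate)))  # hop in the input
    out = list(samples[:frame_len])
    pos = ana_hop
    while pos + frame_len <= len(samples):
        frame = samples[pos:pos + frame_len]
        # Try overlap lengths around the nominal one and keep the best match.
        best_k, best_score = 0, -2.0
        for k in range(-search, search + 1):
            n = overlap + k
            score = _similarity(out[-n:], frame[:n])
            if score > best_score:
                best_k, best_score = k, score
        n = overlap + best_k
        start = len(out) - n
        # Linear cross-fade over the overlapping region, then append the rest.
        for i in range(n):
            w = (i + 1) / (n + 1)
            out[start + i] = (1 - w) * out[start + i] + w * frame[i]
        out.extend(frame[n:])
        pos += ana_hop
    return out
```

Because the overlap length varies per frame, the achieved speed-up is only approximately `rate`. Plain OLA corresponds to skipping the search and always using `k = 0`.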

Non-Linear Time Compression

Algorithm 1: Pause removal plus TC

Energy and Zero Crossing Rate analysis

Leave pauses of 150 ms or shorter untouched

Shorten pauses longer than 150 ms to 150 ms

Apply SOLA algorithm

PR shortens speech by 10-25%
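
The pause-removal step can be sketched as follows. The slide only says that energy and zero-crossing rate are analyzed, so the frame size, both thresholds, and the rule "low energy and low zero-crossing rate means pause" are assumptions made for this example. The result would then be passed to a linear time-compression routine such as SOLA.

```python
def frame_features(frame):
    """Return (mean energy, zero-crossing rate) for one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / max(1, len(frame) - 1)
    return energy, zcr


def remove_long_pauses(samples, sample_rate, max_pause_ms=150, frame_ms=10,
                       energy_thresh=1e-4, zcr_thresh=0.25):
    """Shorten every pause longer than `max_pause_ms` to `max_pause_ms`.

    Thresholds and frame size are hypothetical values, not from the study.
    """
    frame_len = sample_rate * frame_ms // 1000
    if frame_len < 1:
        raise ValueError("frame_ms is too small for this sample rate")
    max_frames = max_pause_ms // frame_ms
    out = []
    run = 0  # number of consecutive pause frames seen so far
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energy, zcr = frame_features(frame)
        is_pause = energy < energy_thresh and zcr < zcr_thresh
        run = run + 1 if is_pause else 0
        # Keep speech frames and the first 150 ms of each pause; drop the rest.
        if run <= max_frames:
            out.extend(frame)
    return out
```

Requiring a low zero-crossing rate as well as low energy is one common way to avoid treating quiet unvoiced consonants as silence.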

Non-Linear Time Compression (cont.)

Algorithm 2: Adaptive TC

Mimics people when talking fast

Pauses and silences are compressed the most

Stressed vowels are compressed the least

Consonants are compressed more than vowels

Consonants are compressed based on neighboring vowels
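
The ordering above (pauses compressed most, stressed vowels least) can be illustrated with a toy allocation of a global target rate across labeled segments. This is not the adaptive algorithm evaluated in the study: the labels, the compressibility weights, and the proportional rule are all hypothetical, and the dependence of consonants on neighboring vowels is not modeled.

```python
# Hypothetical weights: higher means the segment absorbs more of the shortening.
COMPRESSIBILITY = {
    "pause": 1.0,
    "consonant": 0.6,
    "vowel": 0.4,
    "stressed_vowel": 0.2,
}


def allocate_durations(segments, target_rate):
    """Shrink (label, duration_s) segments so the total is total / target_rate.

    Each segment keeps a fraction 1 - alpha * weight of its duration, where
    alpha is chosen so that the overall speed-up equals `target_rate`.
    """
    if target_rate < 1:
        raise ValueError("target_rate must be at least 1")
    total = sum(d for _, d in segments)
    weighted = sum(d * COMPRESSIBILITY[label] for label, d in segments)
    if weighted <= 0:
        raise ValueError("nothing to compress")
    alpha = (total - total / target_rate) / weighted
    if alpha >= 1:
        raise ValueError("target_rate is too high for these weights")
    return [(label, d * (1 - alpha * COMPRESSIBILITY[label]))
            for label, d in segments]
```

With these weights, a segment's local speed-up is `1 / (1 - alpha * weight)`, so pauses end up faster than the global rate and stressed vowels slower.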

System Implications

Computational complexity

Adaptive TC 10x more costly than Linear TC

Complexity in client-server implementation

Buffer management required for non-linear TC

Audio-video synchronization quality
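
One way to see why non-linear TC complicates buffering and audio-video synchronization: source audio is consumed at a varying rate, so a player needs a map from playback time back to source time to pick the matching video frame. The sketch below assumes the compressor reports a list of (source duration, local rate) segments; that interface and the function names are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def build_time_map(segments):
    """Cumulative end times of each segment in source time and in output time.

    `segments` is a list of (source_duration_s, local_rate) pairs.
    """
    src_ends = list(accumulate(d for d, _ in segments))
    out_ends = list(accumulate(d / r for d, r in segments))
    return src_ends, out_ends


def source_time_at(output_time, segments, time_map):
    """Map a playback (output) time to the corresponding source time."""
    src_ends, out_ends = time_map
    if not 0 <= output_time <= out_ends[-1]:
        raise ValueError("output_time is outside the compressed clip")
    # Find the segment containing output_time, clamping at the final segment.
    i = min(bisect_right(out_ends, output_time), len(segments) - 1)
    out_start = out_ends[i - 1] if i else 0.0
    src_start = src_ends[i - 1] if i else 0.0
    return src_start + (output_time - out_start) * segments[i][1]
```

With linear TC the same map collapses to a single multiplication by the speed-up factor, which is part of why the linear case is simpler to implement.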

User Study Method

User Study Goals

Highest intelligible speed

Comprehension

Subjective preference

Sustainable speed

Experiment Method

24 subjects

4 tasks for each subject

3 time compression algorithms

Linear TC using SOLA (Linear)

Pause removal plus Linear TC (PR-Lin)

Adaptive TC (Adapt)

Each test takes approximately 30 minutes
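
For PR-Lin, pause removal already shortens the clip, so the linear stage needs a smaller factor to reach a given overall speed. If pause removal cuts a fraction r of the duration, a clip of length T becomes T(1 - r), and reaching an overall speed-up S requires a linear factor of S(1 - r). The helper below encodes that arithmetic. It assumes that a stated speed such as 1.5x refers to the overall speed-up of PR-Lin, which the slides do not spell out.

```python
def linear_factor_after_pause_removal(target_rate, pause_fraction):
    """Linear speed-up still needed after pause removal.

    target_rate: desired overall speed-up (e.g. 1.5).
    pause_fraction: fraction of the duration removed by pause removal (0 to <1).
    """
    if not 0 <= pause_fraction < 1:
        raise ValueError("pause_fraction must be in [0, 1)")
    return target_rate * (1 - pause_fraction)
```

With pause removal of 10-25%, an overall 1.5x needs a linear factor between 1.125 and 1.35, and an overall 2.5x needs between 1.875 and 2.25.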

Highest Intelligible Speed Task

3 clips from technical talks

Find the highest speed at which most words are understandable

Comprehension Task

3 clips at 1.5x and 3 clips at 2.5x

Clips from TOEFL listening test

Answer 4 multiple choice questions

Subjective Preference Task

3 pairs of clips at 1.5x

3 pairs of clips at 2.5x

Each pair contains the same clip compressed with 2 of the 3 TC algorithms

Indicate preference on 3-point scale

Sustainable Speed Task

3 clips, each 8 minutes long

Clips from a CD audio book

Find the maximum comfortable speed

Write a 4-5 sentence summary at the end

User Study Results

Highest Intelligible Speed Task

PR-Lin is significantly better than Adapt (p<.01)

[Bar chart: compression rate (y-axis, 0 to 3) for Linear, PR-Lin, and Adapt]

Comprehension Task

[Bar chart: score in % (y-axis, 0 to 90) for Linear, PR-Lin, and Adapt, with separate series for 1.5x and 2.5x]

Adapt is better than PR-Lin (p=.083) at 2.5x

Preference Task at 1.5x

Slight preference for PR-Lin (p=.093)

Preference counts at 1.5x (24 subjects)

Pair (former vs. latter)   Prefer former   Prefer none   Prefer latter
Linear vs. PR-Lin                6              5             13
PR-Lin vs. Adapt                13              5              6
Adapt vs. Linear                 8              8              8

Preference Task at 2.5x

PR-Lin and Adapt do significantly better than Linear

Preference counts at 2.5x (24 subjects)

Pair (former vs. latter)   Prefer former   Prefer none   Prefer latter
Linear vs. PR-Lin                2              8             14
PR-Lin vs. Adapt                 4              9             11
Adapt vs. Linear                21              3              0
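
Paired preference counts like these can be checked with a two-sided sign test that ignores the "prefer none" answers. The slides do not say which statistical test the authors used, so this is only one plausible way to examine the counts, and its p-values need not match the ones quoted on the slides.

```python
from math import comb


def sign_test_p(former, latter):
    """Two-sided sign test p-value for paired preference counts (ties dropped)."""
    n = former + latter
    if n == 0:
        return 1.0
    k = min(former, latter)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Applied to the 2.5x counts, `sign_test_p(2, 14)` is about 0.0042 and `sign_test_p(21, 0)` is about 9.5e-7, while `sign_test_p(4, 11)` is about 0.118. Under this test, both non-linear algorithms are clearly preferred to Linear, and the difference between PR-Lin and Adapt is not significant.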

Sustainable Speed Task

[Bar chart: compression rate (y-axis, 0 to 2.5) for Linear, PR-Lin, and Adapt]

Conclusions

Previous Work

Mach1 (Covell et al., ICASSP 98)

Comprehension and preference tasks

Comparing Linear and Mach1 (Adapt) at 2.6-4.2x

Comprehension scores 17% better with Mach1

95% prefer Mach1 to Linear

No data on < 2.0x

Other works (Harrigan, Omoigui, Li, Foulke)

1.2-1.7x is the sustainable listening speed

Conclusions

Trade-off in TC algorithms is task-related

Listening: Linear TC is sufficient

Fast forwarding: Non-linear TC is more suitable

Adaptive TC is close to the way people talk fast

The limit lies in human listening and comprehension

