Automatic Key Term Extraction from Spoken Course Lectures

Yun-Nung (Vivian) Chen, Yu Huang, Sheng-Yi Kong, Lin-Shan Lee
National Taiwan University, Taiwan

Definition

• Key term
  • Higher term frequency
  • Core content
• Two types
  • Keyword
  • Key phrase
• Advantages
  • Indexing and retrieval
  • The relations between key terms and segments of documents

Introduction

[Figure: example key terms from a course lecture, e.g., acoustic model, language model, hmm, n-gram, phone, hidden Markov model, bigram]

Target: extract key terms from course lectures

Automatic Key Term Extraction

[System diagram: speech signal from an archive of spoken documents → ASR → ASR transcriptions → Phrase Identification (Branching Entropy) → Key Term Extraction (Feature Extraction + Learning Methods: 1) K-means Exemplar, 2) AdaBoost, 3) Neural Network) → key terms (e.g., entropy, acoustic model, ...)]

First, branching entropy is used to identify phrases.
Then, learning methods extract key terms based on a set of features.

Branching Entropy

How to decide the boundary of a phrase?

[Figure: the phrase "hidden Markov model" in context; many different words may precede "hidden" (represent, is, can, ...) and many different words may follow "model" (is, of, in, ...)]

• "hidden" is almost always followed by the same word
• "hidden Markov" is almost always followed by the same word
• "hidden Markov model" is followed by many different words
  → a boundary is likely right after "model"

Branching entropy is defined to decide possible boundaries.

• Definition of Right Branching Entropy
  • Probability of a child xi of X, where xi is X extended by one more word and C(.) is the corpus count:
    P(xi) = C(xi) / C(X)
  • Right branching entropy of X:
    Hr(X) = - Σi P(xi) log P(xi)

• Decision of Right Boundary
  • Find the right boundary located between X and xi where the right branching entropy rises, i.e., where many different words can follow

• Decision of Left Boundary
  • Find the left boundary in the same way, using the reversed word order (X: model Markov hidden) to compute the left branching entropy

• Using a PAT tree to implement

• Implementation in the PAT tree
  • Each node stores a word sequence and its count, so P(xi) and Hr(X) can be computed directly from the children of X
  • Example: X = "hidden Markov"; x1 = "hidden Markov model"; x2 = "hidden Markov chain"

[Figure: PAT tree rooted at "hidden", with child "Markov" and children "model", "chain", "state", "distribution", "variable"]
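
The slides give the branching entropy definitions in words only. Below is a minimal Python sketch of one plausible implementation: counts of following words give P(xi), their entropy gives Hr(X), and a rise in entropy while extending a prefix is treated as the right boundary. The toy corpus, the helper names, and the exact boundary rule are illustrative assumptions rather than the authors' formulation; the left boundary would be found the same way on the reversed token sequence.

```python
import math
from collections import defaultdict

def right_branching_entropy(tokens, max_len=5):
    """For every word sequence X up to max_len words, count how often each
    possible next word follows X, and turn those counts into Hr(X)."""
    followers = defaultdict(lambda: defaultdict(int))
    for i in range(len(tokens)):
        for n in range(1, max_len + 1):
            if i + n >= len(tokens):
                break
            X = tuple(tokens[i:i + n])
            followers[X][tokens[i + n]] += 1

    entropy = {}
    for X, counts in followers.items():
        total = sum(counts.values())
        entropy[X] = -sum((c / total) * math.log(c / total, 2)
                          for c in counts.values())
    return entropy

def right_boundary(words, entropy):
    """Illustrative boundary rule: extend the prefix one word at a time; a rise
    in right branching entropy marks the end of the phrase."""
    for end in range(2, len(words) + 1):
        if entropy.get(tuple(words[:end]), 0.0) > entropy.get(tuple(words[:end - 1]), 0.0):
            return words[:end]          # boundary right after words[end-1]
    return words

# Toy corpus: "hidden Markov" is always followed by "model",
# but "hidden Markov model" is followed by many different words.
tokens = ("the hidden Markov model is useful the hidden Markov model of speech "
          "the hidden Markov model in practice").split()
H = right_branching_entropy(tokens)
print(right_boundary("hidden Markov model is useful".split(), H))
# -> ['hidden', 'Markov', 'model']
```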

Automatic Key Term Extraction

Next, prosodic, lexical, and semantic features are extracted for each candidate term.

Feature Extraction

• Prosodic features
  • For each candidate term, at its first appearance

  Feature Name      Feature Description
  Duration (I-IV)   normalized duration (max, min, mean, range)
  Pitch (I-IV)      F0 (max, min, mean, range)
  Energy (I-IV)     energy (max, min, mean, range)

  • Speakers tend to use longer duration to emphasize key terms; four duration values are used per term, with the duration of each phone (e.g., phone "a") normalized by the average duration of that phone
  • Higher pitch may represent significant information
  • Higher energy emphasizes important information
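
As an illustration of how the prosodic table above can be turned into feature values, here is a small sketch that computes the (max, min, mean, range) statistics from per-phone durations and frame-level F0/energy values for a term's first occurrence. The input format and helper names are assumptions; only the normalization idea (each phone's duration divided by that phone's average duration) comes from the slides.

```python
from statistics import mean

def stat4(values):
    """(max, min, mean, range), as used for Duration/Pitch/Energy (I-IV)."""
    return max(values), min(values), mean(values), max(values) - min(values)

def prosodic_features(phones, phone_avg_duration, f0_frames, energy_frames):
    """phones: (phone_label, duration_in_seconds) pairs for the term's first occurrence.
    phone_avg_duration: average duration of each phone over the whole corpus.
    f0_frames / energy_frames: frame-level pitch and energy values over the term."""
    # Duration (I-IV): each phone's duration normalized by that phone's average duration
    norm_durations = [dur / phone_avg_duration[p] for p, dur in phones]
    return {
        "Duration (I-IV)": stat4(norm_durations),
        "Pitch (I-IV)": stat4(f0_frames),      # F0 statistics
        "Energy (I-IV)": stat4(energy_frames), # energy statistics
    }

# toy usage
phones = [("hh", 0.09), ("ih", 0.07), ("d", 0.05)]
avg_dur = {"hh": 0.08, "ih": 0.06, "d": 0.05}
print(prosodic_features(phones, avg_dur,
                        f0_frames=[180.0, 192.0, 185.0],
                        energy_frames=[0.61, 0.70, 0.66]))
```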

• Lexical features: some well-known lexical features are used for each candidate term

  Feature Name   Feature Description
  TF             term frequency
  IDF            inverse document frequency
  TFIDF          tf * idf
  PoS            the PoS tag
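
A minimal sketch of the lexical features, assuming the standard TF/IDF definitions (the slides do not spell out the exact variants, and the PoS tag would come from a separate tagger):

```python
import math

def lexical_features(term, documents):
    """documents: one token list per document (e.g., per lecture or segment).
    Returns TF, IDF, and TF*IDF for the candidate term."""
    tf = sum(doc.count(term) for doc in documents)          # term frequency
    df = sum(1 for doc in documents if term in doc)         # document frequency
    idf = math.log(len(documents) / df) if df else 0.0      # inverse document frequency
    return {"TF": tf, "IDF": idf, "TFIDF": tf * idf}

docs = [["the", "acoustic", "model", "is", "trained"],
        ["language", "model", "and", "acoustic", "model"],
        ["viterbi", "algorithm"]]
print(lexical_features("acoustic", docs))
```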

• Semantic features: Probabilistic Latent Semantic Analysis (PLSA)
  • Key terms tend to focus on limited topics

  [Figure: PLSA model relating documents Di, latent topics Tk, and terms tj through P(Tk | Di) and P(tj | Tk); Di: documents, Tk: latent topics, tj: terms]

  • Latent Topic Probability: PLSA describes a probability distribution over latent topics for each term. How to use it? Summarize the distribution: key terms concentrate on a few topics, while non-key terms spread over many.
  • Latent Topic Significance: the within-topic to out-of-topic frequency ratio.
  • Latent Topic Entropy: the entropy of the term's latent topic distribution; key terms tend to have lower LTE, non-key terms higher LTE.

  Feature Name   Feature Description
  LTP (I-III)    Latent Topic Probability (mean, variance, standard deviation)
  LTS (I-III)    Latent Topic Significance (mean, variance, standard deviation)
  LTE            term entropy over latent topics
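
The semantic features summarize how concentrated a term's PLSA topic distribution is. The sketch below is one plausible reading of the slides: LTP statistics over the term's topic posterior, LTS as a within-topic versus out-of-topic frequency ratio, and LTE as the entropy of the topic distribution. The exact definitions in the paper may differ, so treat the formulas and input format as assumptions.

```python
import math

def semantic_features(term_topic_prob, term_doc_freq, doc_topic_prob):
    """term_topic_prob: P(Tk | t) for one candidate term, a length-K list.
    term_doc_freq: n(t, Di), the term's count in each document.
    doc_topic_prob: P(Tk | Di) for each document (list of length-K lists)."""
    K = len(term_topic_prob)

    def stats(values):
        m = sum(values) / len(values)
        var = sum((v - m) ** 2 for v in values) / len(values)
        return m, var, math.sqrt(var)

    # LTP (I-III): mean / variance / standard deviation of the topic probabilities
    ltp = stats(term_topic_prob)

    # LTS: within-topic vs. out-of-topic frequency ratio per topic (assumed form),
    # summarized by mean / variance / standard deviation
    lts_per_topic = []
    for k in range(K):
        within = sum(n * d[k] for n, d in zip(term_doc_freq, doc_topic_prob))
        outside = sum(n * (1.0 - d[k]) for n, d in zip(term_doc_freq, doc_topic_prob))
        lts_per_topic.append(within / outside if outside > 0 else float("inf"))
    lts = stats(lts_per_topic)

    # LTE: entropy of the term's topic distribution (lower for key terms)
    lte = -sum(p * math.log(p) for p in term_topic_prob if p > 0)

    return {"LTP (I-III)": ltp, "LTS (I-III)": lts, "LTE": lte}

# toy usage: 3 latent topics, 2 documents
print(semantic_features(term_topic_prob=[0.7, 0.2, 0.1],
                        term_doc_freq=[3, 1],
                        doc_topic_prob=[[0.8, 0.1, 0.1], [0.3, 0.5, 0.2]]))
```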

Automatic Key Term Extraction

Finally, unsupervised and supervised learning approaches are used to extract the key terms.

Learning Methods

• Unsupervised learning: K-means Exemplar
  • Transform each candidate term into a vector in LTS (Latent Topic Significance) space
  • Run K-means
  • Take the term closest to the centroid of each cluster (the exemplar) as a key term
  • The terms in the same cluster focus on a single topic; the candidate terms in the same group are related to the key term, so the key term can represent this topic
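
As a concrete illustration of this unsupervised step, the sketch below clusters candidate terms in LTS space and returns the term nearest to each centroid as the exemplar. The library choice (scikit-learn), the toy vectors, and the parameter values are illustrative assumptions, not the authors' setup.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_exemplar(term_names, lts_vectors, n_key_terms):
    """lts_vectors: one LTS-space vector per candidate term (terms x topics).
    Returns one exemplar per cluster: the candidate term closest to the cluster
    centroid, taken as the key term representing that topic."""
    X = np.asarray(lts_vectors, dtype=float)
    km = KMeans(n_clusters=n_key_terms, n_init=10, random_state=0).fit(X)
    exemplars = []
    for k in range(n_key_terms):
        members = np.where(km.labels_ == k)[0]
        dists = np.linalg.norm(X[members] - km.cluster_centers_[k], axis=1)
        exemplars.append(term_names[members[np.argmin(dists)]])
    return exemplars

terms = ["acoustic model", "language model", "entropy", "viterbi algorithm"]
vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.1, 0.9, 0.2], [0.2, 0.1, 0.9]]
print(kmeans_exemplar(terms, vectors, n_key_terms=3))
```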

• Supervised learning
  • Adaptive Boosting (AdaBoost)
  • Neural Network
  • Automatically adjust the weights of features to produce a classifier
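
The slides name the classifiers but not their configuration; the following sketch simply trains off-the-shelf AdaBoost and feed-forward neural-network classifiers on per-term feature vectors with binary key-term labels. The scikit-learn models and hyperparameters are stand-ins for the authors' setup.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.neural_network import MLPClassifier

def train_key_term_classifiers(features, labels):
    """features: (n_candidate_terms, n_features) prosodic/lexical/semantic values.
    labels: 1 for reference key terms, 0 otherwise."""
    X, y = np.asarray(features, dtype=float), np.asarray(labels)
    ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)
    nn = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0).fit(X, y)
    return ada, nn

# toy usage: two features per candidate term
X = [[0.9, 0.8], [0.2, 0.1], [0.7, 0.9], [0.1, 0.3]]
y = [1, 0, 1, 0]
ada, nn = train_key_term_classifiers(X, y)
print(ada.predict([[0.8, 0.7]]), nn.predict([[0.8, 0.7]]))
```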


Experiments

• Corpus: NTU lecture corpus
  • Mandarin Chinese with embedded English words
  • Single speaker
  • 45.2 hours
  • Example: 我們的solution是viterbi algorithm (Our solution is the Viterbi algorithm)

• ASR Accuracy

  Language      Mandarin  English  Overall
  Char Acc (%)  78.15     53.44    76.26

  • Acoustic model: bilingual (Chinese/English) AM with model adaptation, starting from a speaker-independent model and adapting with some data from the target speaker
  • Language model: trigram interpolation of a background LM trained on out-of-domain corpora and an adaptive LM trained on the in-domain corpus

• Reference Key Terms
  • Annotations from 61 students who have taken the course
  • If the k-th annotator labeled Nk key terms, each of them received a score of 1/Nk, and all other terms received 0
  • Terms are ranked by the sum of the scores given by all annotators
  • The top N terms of the list are chosen, where N is the average Nk
  • N = 154 key terms: 59 key phrases and 95 keywords
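
The annotator scoring scheme can be written down compactly. A minimal sketch follows, assuming each annotator contributes a plain list of labeled terms and that the per-term score is 1/Nk as described above.

```python
from collections import defaultdict

def reference_key_terms(annotations):
    """annotations: one list of labeled terms per annotator.
    Each of the Nk terms labeled by annotator k scores 1/Nk; terms are ranked
    by the sum of scores over all annotators and the top N are kept, where N
    is the average Nk."""
    scores = defaultdict(float)
    for labeled in annotations:
        for term in labeled:
            scores[term] += 1.0 / len(labeled)
    n = round(sum(len(labeled) for labeled in annotations) / len(annotations))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]

# toy usage with three annotators
print(reference_key_terms([
    ["acoustic model", "hmm", "viterbi algorithm"],
    ["acoustic model", "language model"],
    ["hmm", "acoustic model", "entropy", "language model"],
]))
```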

• Evaluation
  • Unsupervised learning: set the number of extracted key terms to N
  • Supervised learning: 3-fold cross validation
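
For reference, the F-measure reported in the following slides can be computed from an extracted key term list and the reference set with the standard definition (the slides do not show the formula):

```python
def f_measure(extracted, reference):
    """Precision/recall/F1 of extracted key terms against the reference key terms."""
    extracted, reference = set(extracted), set(reference)
    tp = len(extracted & reference)
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(reference) if reference else 0.0
    return (2 * precision * recall / (precision + recall)) if precision + recall else 0.0

print(f_measure(["hmm", "entropy", "pitch"], ["hmm", "entropy", "acoustic model"]))
```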

• Feature Effectiveness
  • Neural network, for keywords from ASR transcriptions (Pr: prosodic, Lx: lexical, Sm: semantic)

  Feature set  Pr     Lx     Sm     Pr+Lx  Pr+Lx+Sm
  F-measure    20.78  42.86  35.63  48.15  56.55

  • Each set of features alone gives an F1 from 20% to 42%
  • Prosodic features and lexical features are additive
  • All three sets of features are useful

• Overall Performance
  • Baseline: conventional TFIDF scores without branching entropy, with stop word removal and PoS filtering
  • U: unsupervised, S: supervised, AB: AdaBoost, NN: Neural Network

  F-measure   Baseline  U: TFIDF  U: K-means  S: AB   S: NN
  manual      23.38     51.95     55.84       62.39   67.31
  ASR         20.78     43.51     52.60       57.68   62.70

  • Branching entropy performs well
  • K-means Exemplar outperforms TFIDF
  • Supervised approaches are better than unsupervised approaches
  • The performance on ASR transcriptions is slightly worse than on manual transcriptions but still reasonable
  • Supervised learning with the neural network gives the best results


Conclusion

• We propose a new approach to extract key terms
• The performance can be improved by
  • Identifying phrases with branching entropy
  • Using prosodic, lexical, and semantic features together
• The results are encouraging

We thank the reviewers for their valuable comments.

NTU Virtual Instructor: http://speech.ee.ntu.edu.tw/~RA/lecture
