Selective Encoding for Abstractive Sentence Summarization

Transcript
Page 1: Selective encoding for abstractive sentence summarization

Selective Encoding for Abstractive Sentence Summarization

Qingyu Zhou, Nan Yang, Furu Wei and Ming Zhou

ACL 2017

Presenter: Kodaira Tomonori

Page 2: Selective encoding for abstractive sentence summarization

Task

• Task: Abstractive sentence summarization

• Input: sentence; Output: sentence

Figure 1.

Page 3: Selective encoding for abstractive sentence summarization

Introduction

• This task differs from MT:
  1. There is no explicit alignment relationship.
  2. The task needs to keep the highlights and remove the unnecessary information.

Page 4: Selective encoding for abstractive sentence summarization

Improvement

• Problem: In the previous framework there is no explicit alignment relationship between the input sentence and the summary, except for extracted common words.

• Solution: Their method does not infer the alignment, but selects the highlights while filtering out secondary information in the input.

Page 5: Selective encoding for abstractive sentence summarization

Problem Formulation

• Input sentence: x = (x_1, x_2, …, x_n), where x_i ∈ V_s (the source vocabulary)

• Output: y = (y_1, y_2, …, y_l), where l ≤ n

Page 6: Selective encoding for abstractive sentence summarization

Model

Page 7: Selective encoding for abstractive sentence summarization

Sentence Encoder

• bidirectional GRU

• The initial states are set to zero vectors.

• After reading the sentence, the forward and backward hidden states are concatenated at each position (a sketch follows).
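
A minimal PyTorch sketch of such an encoder (class and variable names are illustrative, not the authors' code): a zero-initialized bidirectional GRU whose forward and backward states are concatenated at each position.

```python
import torch.nn as nn

class SentenceEncoder(nn.Module):
    """Bidirectional GRU encoder; returns concatenated states h_i = [h_fwd_i ; h_bwd_i]."""
    def __init__(self, vocab_size, emb_size=300, hidden_size=512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_size)
        # bidirectional=True runs a forward and a backward GRU over the input
        self.gru = nn.GRU(emb_size, hidden_size, bidirectional=True, batch_first=True)

    def forward(self, x):
        # x: (batch, n) token ids; initial hidden states default to zero vectors
        emb = self.embedding(x)
        h, _ = self.gru(emb)  # h: (batch, n, 2 * hidden_size)
        return h
```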

Page 8: Selective encoding for abstractive sentence summarization

Selective Mechanism

• Sentence representation: s = [h_backward,1 ; h_forward,n]

• sGate_i = σ(W_s h_i + U_s s + b)

• h'_i = h_i ⊙ sGate_i
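
A sketch of this gate under the dimensions above (2 × 512 = 1024 for the concatenated BiGRU states); module and parameter names are assumptions:

```python
import torch
import torch.nn as nn

class SelectiveGate(nn.Module):
    """h'_i = h_i ⊙ σ(W_s h_i + U_s s + b); a sketch, sizes assumed."""
    def __init__(self, enc_size=1024):  # 2 * 512 concatenated BiGRU states
        super().__init__()
        self.W_s = nn.Linear(enc_size, enc_size, bias=False)
        self.U_s = nn.Linear(enc_size, enc_size, bias=True)  # its bias plays the role of b

    def forward(self, h):
        # h: (batch, n, enc_size), last dim ordered [forward ; backward]
        half = h.size(-1) // 2
        # sentence representation s = [h_backward,1 ; h_forward,n]
        s = torch.cat([h[:, 0, half:], h[:, -1, :half]], dim=-1).unsqueeze(1)
        gate = torch.sigmoid(self.W_s(h) + self.U_s(s))  # sGate_i, broadcast over i
        return h * gate  # element-wise selection h'_i
```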

Page 9: Selective encoding for abstractive sentence summarization

Summary Decoder

• GRU:
  s_t = GRU(w_{t-1}, c_{t-1}, s_{t-1})
  s_0 = tanh(W_d h_backward,1 + b)

• Attention:
  e_{t,i} = v_a^T tanh(W_a s_{t-1} + U_a h'_i)
  a_{t,i} = exp(e_{t,i}) / Σ_{i=1}^{n} exp(e_{t,i})
  c_t = Σ_{i=1}^{n} a_{t,i} h'_i

• Predict (maxout readout):
  r_t = W_r w_{t-1} + U_r c_t + V_r s_t
  m_t = [max{r_{t,2j-1}, r_{t,2j}}]^T for j = 1, …, d
  p(y_t | y_1, …, y_{t-1}) = softmax(W_o m_t)

• Notation: w: word embedding; c: context vector; s: decoder hidden state; h'_i: selected encoder state; r_t: readout state
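
The equations above can be wired together roughly as follows: a PyTorch sketch of a single decoding step, with parameter names mirroring the slide and all sizes assumed.

```python
import torch
import torch.nn as nn

class DecoderStep(nn.Module):
    """One decoding step: GRU update, attention over h', maxout readout.
    A sketch mirroring the slide's equations; sizes are assumptions."""
    def __init__(self, emb_size=300, hidden_size=512, enc_size=1024, vocab_size=30000):
        super().__init__()
        self.gru = nn.GRUCell(emb_size + enc_size, hidden_size)  # input [w_{t-1}; c_{t-1}]
        self.W_a = nn.Linear(hidden_size, hidden_size, bias=False)
        self.U_a = nn.Linear(enc_size, hidden_size, bias=False)
        self.v_a = nn.Linear(hidden_size, 1, bias=False)
        self.W_r = nn.Linear(emb_size, 2 * hidden_size, bias=False)
        self.U_r = nn.Linear(enc_size, 2 * hidden_size, bias=False)
        self.V_r = nn.Linear(hidden_size, 2 * hidden_size, bias=False)
        self.W_o = nn.Linear(hidden_size, vocab_size, bias=False)

    def forward(self, w_prev, c_prev, s_prev, h_sel):
        # s_t = GRU(w_{t-1}, c_{t-1}, s_{t-1}); s_0 would be tanh(W_d h_backward,1 + b)
        s_t = self.gru(torch.cat([w_prev, c_prev], dim=-1), s_prev)
        # e_{t,i} = v_a^T tanh(W_a s_{t-1} + U_a h'_i)
        e = self.v_a(torch.tanh(self.W_a(s_prev).unsqueeze(1) + self.U_a(h_sel)))
        a = torch.softmax(e.squeeze(-1), dim=-1)           # a_{t,i}
        c_t = torch.bmm(a.unsqueeze(1), h_sel).squeeze(1)  # c_t = Σ_i a_{t,i} h'_i
        # readout r_t, then maxout over adjacent pairs: m_{t,j} = max(r_{t,2j-1}, r_{t,2j})
        r = self.W_r(w_prev) + self.U_r(c_t) + self.V_r(s_t)
        m = r.view(r.size(0), -1, 2).max(dim=-1).values
        return torch.softmax(self.W_o(m), dim=-1), s_t, c_t  # p(y_t | y_<t), s_t, c_t
```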

Page 10: Selective encoding for abstractive sentence summarization

Objective Function

• J(θ) = −(1/|D|) Σ_{(x,y)∈D} log p(y|x), where D is a set of parallel sentence–summary pairs

• optimizer: Stochastic Gradient Descent
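
A minimal sketch of this objective in PyTorch, assuming the decoder's per-step distributions have been gathered into log-probabilities (`pad_id` is an illustrative padding index, not from the paper):

```python
import torch.nn.functional as F

def summary_nll(log_probs, targets, pad_id=0):
    """J(θ) = -(1/|D|) Σ_{(x,y)∈D} log p(y|x), averaged over target tokens.
    log_probs: (batch, l, vocab) log p(y_t | y_<t); targets: (batch, l) token ids."""
    return F.nll_loss(log_probs.transpose(1, 2), targets, ignore_index=pad_id)
```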

Page 11: Selective encoding for abstractive sentence summarization

Dataset

• Training set: English Gigaword dataset (Napoles et al., 2012)
  Training: 3.8M sentence–summary pairs
  Development: 189K pairs

• Test sets:
  1. English Gigaword
  2. DUC 2004
  3. MSR Abstractive Text Compression (MSR-ATC)

Page 12: Selective encoding for abstractive sentence summarization

Data statistics

Table 2

Page 13: Selective encoding for abstractive sentence summarization

Evaluation Metric

• ROUGE (Lin, 2004): ROUGE-1, ROUGE-2, ROUGE-L
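
For intuition, a toy ROUGE-N recall computation (a simplified sketch, not Lin's official toolkit, which adds stemming, stopword handling, and F-measure variants):

```python
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    """Overlapping n-grams between candidate and reference, divided by the
    number of reference n-grams (the recall used in ROUGE-N)."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    overlap = sum((ngrams(candidate) & ngrams(reference)).values())
    return overlap / max(sum(ngrams(reference).values()), 1)

# e.g. rouge_n_recall("police arrest suspect".split(), "police arrested a suspect".split())
```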

Page 14: Selective encoding for abstractive sentence summarization

Implementation Details

• Parameters:
  Embedding size: 300
  GRU hidden state size: 512
  Dropout (Srivastava et al., 2014) with p = 0.5

• Training:
  Adam (α = 0.001, β1 = 0.9, β2 = 0.999)
  Gradient clipping to [−5, 5]

• Beam search: beam size 12
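
A generic beam-search sketch with beam size 12 as in the slides; `step_fn` is a hypothetical callable wrapping one decoder step and returning log-probabilities plus the updated state:

```python
import torch

def beam_search(step_fn, init_state, bos_id, eos_id, beam_size=12, max_len=50):
    """Keep the `beam_size` highest-scoring partial summaries at each step."""
    beams = [([bos_id], 0.0, init_state)]
    for _ in range(max_len):
        candidates = []
        for tokens, score, state in beams:
            if tokens[-1] == eos_id:             # finished hypothesis: carry over
                candidates.append((tokens, score, state))
                continue
            log_p, new_state = step_fn(tokens[-1], state)  # (vocab,) log-probs
            top = torch.topk(log_p, beam_size)
            for lp, idx in zip(top.values.tolist(), top.indices.tolist()):
                candidates.append((tokens + [idx], score + lp, new_state))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_size]
        if all(b[0][-1] == eos_id for b in beams):
            break
    return beams[0][0]  # best-scoring token sequence
```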

Page 15: Selective encoding for abstractive sentence summarization

Baselines

• ABS (Rush et al., 2015)

• ABS+ (Rush et al., 2015)

• CAs2s (Chopra et al., 2016)

• Feats2s (Nallapati et al., 2016)

• Luong-NMT (Luong et al., 2015)

• s2s+att: they also implement a sequence-to-sequence model with attention

Page 16: Selective encoding for abstractive sentence summarization

English Gigaword

Table 3

Page 17: Selective encoding for abstractive sentence summarization

DUC 2004

Table 4

Page 18: Selective encoding for abstractive sentence summarization

MSR-ATC

Figure 5

Page 19: Selective encoding for abstractive sentence summarization

Saliency Heat Map of Selective Gate

• They use the method of Li et al. (2016) to visualize the contribution of the selective gate to the final output.

• They approximate the saliency score S_y(g) by computing its first-order Taylor expansion.

• They draw the Euclidean norm of the first derivative of the output y with respect to the selective gate g associated with each input word (see the sketch below).

Figure 3
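
A sketch of that gradient computation in PyTorch; `gate` is assumed to be the selective-gate activation tensor retained from the forward pass, and `output_score` a scalar score of the decoded output y (both hypothetical handles, not the authors' API):

```python
def gate_saliency(gate, output_score):
    """First-order Taylor approximation of S_y(g): the saliency of each input
    word i is the Euclidean norm of d(output)/d(gate_i)."""
    gate.retain_grad()              # gate is a non-leaf tensor; keep its grad
    output_score.backward()         # populates gate.grad with dy/dg
    return gate.grad.norm(dim=-1)   # (batch, n): one score per input word
```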

Page 20: Selective encoding for abstractive sentence summarization

Conclusion

• They propose a selective encoding model that extends the sequence-to-sequence framework for abstractive sentence summarization.

• The model greatly improves performance over state-of-the-art baselines on all three test sets.

