End-To-End Memory Networks
Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus
Dept. of Computer Science, Courant Institute, NYU & Facebook AI Research, New York
Outline
• Motivation
• Model
• Experiments
• Results
• Conclusion
Motivation
• Make a model that can perform many computational steps to answer a question.
• Make a model that captures long-term dependencies in sequential data.
• i.e., sequential reasoning
• Lightweight & easily trainable
Motivation over MemNN
• End-to-end trainable
• Far less supervision
• More generalizable
Overview of Model
• Variables:
 – Discrete set of inputs ($x_i$)
 – A query ($q$)
 – Produce an answer ($a$)
• Static memory bank
• Multiple hops
$p_i = \mathrm{Softmax}(u^\top m_i), \qquad o = \sum_i p_i c_i$
$\hat{a} = \mathrm{Softmax}(W(o + u))$
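A minimal NumPy sketch of one memory hop implementing the three equations above; dimensions and random initialization are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

d, V, n_mem = 20, 50, 10          # embedding dim, vocab size, number of memories (assumed)
m = np.random.randn(n_mem, d)     # input memories  m_i (already embedded with A)
c = np.random.randn(n_mem, d)     # output memories c_i (already embedded with C)
u = np.random.randn(d)            # embedded query  u = B q
W = np.random.randn(V, d)         # final answer-prediction matrix

p = softmax(m @ u)                # p_i = Softmax(u^T m_i): attention over memories
o = p @ c                         # o = sum_i p_i c_i: weighted sum of output memories
a_hat = softmax(W @ (o + u))      # a_hat = Softmax(W(o + u)): answer distribution
```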
Weight Tying
• Adjacent:
 – $A^{k+1} = C^k$
 – $W^\top = C^K$
 – $B = A^1$
• Layer-wise (RNN-like), sketched below:
 – $A^1 = \dots = A^K$, $C^1 = \dots = C^K$
 – $u^{k+1} = H u^k + o^k$
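A hedged sketch of $K$ hops with layer-wise tying: the same $A$ and $C$ embeddings are reused at every hop and the query state is updated as $u^{k+1} = H u^k + o^k$. Shapes and the random $H$ are assumptions for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def hops_layerwise(mem_A, mem_C, u, H, K=3):
    """mem_A, mem_C: (n_mem, d) memories embedded once with the shared A and C."""
    for _ in range(K):
        p = softmax(mem_A @ u)   # attention over the shared A-embedded memories
        o = p @ mem_C            # read-out from the shared C-embedded memories
        u = H @ u + o            # u^{k+1} = H u^k + o^k
    return u

d, n_mem = 20, 10
u_final = hops_layerwise(np.random.randn(n_mem, d),
                         np.random.randn(n_mem, d),
                         np.random.randn(d),
                         np.random.randn(d, d), K=3)
```

With adjacent tying there is no $H$; the update is simply $u^{k+1} = u^k + o^k$, and hop $k$'s output embedding $C^k$ becomes hop $k{+}1$'s input embedding $A^{k+1}$.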
Sentence Representation
• Bag-of-words – $m_i = \sum_j A x_{ij}$
• Position Encoding (PE) – $m_i = \sum_j l_j \cdot A x_{ij}$
• Temporal Encoding (TE) – $m_i = \sum_j A x_{ij} + T_A(i)$
(All three are sketched in code below.)
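An illustrative NumPy sketch of the three sentence representations; the vocabulary, sentence length, and random embeddings are assumptions. The PE weights follow the paper's scheme $l_{kj} = (1 - j/J) - (k/d)(1 - 2j/J)$.

```python
import numpy as np

def position_encoding(J, d):
    # l_{kj} = (1 - j/J) - (k/d)(1 - 2j/J): position j in 1..J, component k in 1..d
    j = np.arange(1, J + 1)[:, None]
    k = np.arange(1, d + 1)[None, :]
    return (1 - j / J) - (k / d) * (1 - 2 * j / J)

d, V, J = 20, 50, 6                          # embedding dim, vocab size, sentence length (assumed)
A = np.random.randn(V, d)                    # embedding matrix; rows are word vectors
words = np.random.randint(0, V, size=J)      # word ids of sentence x_i
emb = A[words]                               # (J, d) word embeddings A x_{ij}

m_bow = emb.sum(axis=0)                                 # bag-of-words: sum_j A x_{ij}
m_pe  = (position_encoding(J, d) * emb).sum(axis=0)     # PE: sum_j l_j * A x_{ij}
T_A_i = np.random.randn(d)                              # learned temporal term T_A(i) for slot i
m_te  = emb.sum(axis=0) + T_A_i                         # TE: bag-of-words plus T_A(i)
```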
Synthetic QA Experiments
Similarity to Attention
NOTE: This model does not use the supporting-fact labels during training
Results
Language Modeling
Conclusion
• Outperforms all baselines with the same level of supervision (LSTMs, etc.)
• Slightly worse than the strongly supervised Memory Network, but since it is trained without supporting facts, it can be applied in more general settings.
• On language modeling, outperforms RNNs and LSTMs