Page 1
Confidential + Proprietary
Deep Learning for Language Understanding (at Google Scale)
Anjuli Kannan
Software Engineer, Google Brain
Page 2
Text is just a sequence of words
["hi", "team", "the", "server", "appears", "to", "be", "dropping", "about", "10%", …]
Page 3
About me
● My team: Google Brain
○ "Make machines intelligent, improve people's lives."
○ Research + software + applications
○ g.co/brain
● My work is at the boundary of research and applications
● Focus on natural language understanding
Page 4
Neural network basics
Page 5
Neural network
Outputs: "Is a 4", "Is a 5", ...
Image: Wikipedia
Page 6
Neural network
Neuron
Outputs: "Is a 4", "Is a 5", ...
Page 7
Basic building block is the neuron
Greg Corrado
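The slide above says the basic building block is the neuron: a weighted sum of inputs plus a bias, passed through a nonlinearity. A minimal sketch in plain Python (the sigmoid nonlinearity and the example weights are illustrative assumptions, not taken from the slides):

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias, then a nonlinearity (here sigmoid).
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Example: a neuron with two inputs; the weights are arbitrary.
y = neuron([1.0, 2.0], [0.5, -0.25], 0.1)  # an activation strictly between 0 and 1
```

A full network layer applies many such neurons in parallel, which is why real implementations use matrix multiplication instead of this per-neuron loop.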
Page 8
Gradient descent
w' = w - α ∂L(w)/∂w

α is the learning rate.
Slide: Vincent Vanhoucke
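The update rule above can be applied directly in code. As an illustrative example (the loss L(w) = (w - 3)^2 and the step count are assumptions, not from the slides), repeatedly subtracting the scaled gradient drives w toward the minimum:

```python
def gradient_descent_step(w, grad, learning_rate):
    # w' = w - alpha * dL/dw
    return w - learning_rate * grad

# Minimize L(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = 0.0
for _ in range(100):
    w = gradient_descent_step(w, 2 * (w - 3), learning_rate=0.1)
# w has converged close to the minimum at w = 3.
```

In a neural network the same step is taken simultaneously for every weight, with the gradients computed by backpropagation.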
Page 9
Recurrent neural networks
Page 10
Recurrent neural networks can model sequences
Page 11
Recurrent neural networks can model sequences
How
Message
Page 12
How are
Message
Recurrent neural networks can model sequences
Page 13
How are you
Message
Recurrent neural networks can model sequences
Page 14
How are you ?
Message
Recurrent neural networks can model sequences
Page 15
Internal state is a fixed length encoding of the message
How are you ?
Message
Recurrent neural networks can model sequences
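The build-up on the slides above, where the internal state is updated one word at a time and ends as a fixed-length encoding, can be sketched as a recurrence. This is a deliberately simplified elementwise version (a real RNN uses full weight matrices; the word vectors and weights here are illustrative assumptions):

```python
import math

def rnn_step(state, word_vec, w_state=0.5, w_input=1.0, bias=0.0):
    # Combine the previous state with the current input, elementwise.
    # (A real RNN uses weight matrices; scalars keep the sketch short.)
    return [math.tanh(w_state * s + w_input * x + bias)
            for s, x in zip(state, word_vec)]

# Feed a short "sentence" of 2-dimensional word vectors through the recurrence.
state = [0.0, 0.0]
for word_vec in [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]:
    state = rnn_step(state, word_vec)
# `state` is now a fixed-length encoding of the whole sequence.
```

Whatever the sequence length, the final state has the same dimensionality, which is what makes it usable as a summary of the message.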
Page 16
Sequence-to-sequence models
Page 17
Suppose we want to generate email replies
Smart Reply
Incoming email
Response email
Page 18
Sequence-to-sequence model
Sutskever et al, NIPS 2014
Page 19
Sequence-to-sequence model
encoder decoder
Page 20
Sequence-to-sequence model
Ingests incoming message
Generates reply message
Page 21
Encoder ingests the incoming message
Internal state is a fixed length encoding of the message
How are you ?
Message
Page 22
Decoder is initialized with final state of encoder
How are you ? __ How are you ?
Message
Page 24
How are you ? __
I
Message
Response
Decoder predicts next word
Page 25
How are you ? __ I
I am
Message
Response
Decoder predicts next word
Page 26
How are you ? __ I am
I am great
Message
Response
Decoder predicts next word
Page 27
How are you ? __ I am great
I am great !
Message
Response
Vinyals & Le, ICML DL 2015
Kannan et al, KDD 2016
Decoder predicts next word
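The loop shown on the last few slides, where the decoder starts from the encoder's final state and feeds each predicted word back in as the next input, is greedy decoding. A sketch under stated assumptions: `toy_step` is a hypothetical stand-in that replays a canned reply, whereas a real model would run one step of the decoder RNN and pick the highest-probability word:

```python
def greedy_decode(decoder_step, init_state, start_token, end_token, max_len=20):
    # The decoder is initialized with the encoder's final state and emits one
    # word at a time, feeding each prediction back in as the next input.
    state, token, output = init_state, start_token, []
    for _ in range(max_len):
        token, state = decoder_step(token, state)
        if token == end_token:
            break
        output.append(token)
    return output

# Toy stand-in for one decoder step: "state" is just an index into a canned
# reply (a real model would compute a softmax over the vocabulary).
reply = ["I", "am", "great", "!", "</s>"]

def toy_step(token, state):
    return reply[state], state + 1

decoded = greedy_decode(toy_step, 0, "<s>", "</s>")  # ['I', 'am', 'great', '!']
```

Generation stops when the model emits the end-of-sequence token, which is how the reply can be shorter or longer than the incoming message.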
Page 28
What the model can do
Page 30
Summary
- Neural networks learn feature representations from raw data
- Recurrent neural networks have statefulness, which allows them to model sequences of data such as text
- The sequence-to-sequence model contains two recurrent neural networks: one to encode an input sequence and one to generate an output sequence
Page 33
Research: Speech recognition
Page 34
Research: Electronic health records
Page 36
Resources
- All TensorFlow tutorials: https://www.tensorflow.org/versions/master/tutorials/index.html
- Sequence-to-sequence tutorial (machine translation): https://www.tensorflow.org/versions/master/tutorials/seq2seq
- Chris Olah's blog: http://colah.github.io/