CS 378 Lecture 16
Today:
- RNNs / LSTMs (the type of RNN you will be using)
- Implementation
- Midterm back soon

Language model: P(w_i | w_1, ..., w_{i-1})  ("predict the next word")
Recap: RNNs + language modeling (embedding dim d = 50)

[diagram: RNN unrolled over the words of "I saw the dog"; each word is embedded and hidden states feed forward left to right]

P(w | I saw the dog) = softmax(Z h_i)
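The recap above can be sketched in a few lines of numpy. This is a minimal illustration, not the lecture's actual code: the vocabulary, dimension (4 instead of 50), and random parameters W, V, Z, E are all stand-ins.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
vocab = ["I", "saw", "the", "dog", "ran"]
d = 4                                     # tiny stand-in for d = 50
E = rng.normal(size=(len(vocab), d))      # word embeddings
W = rng.normal(size=(d, d))               # input-to-hidden weights
V = rng.normal(size=(d, d))               # hidden-to-hidden weights
Z = rng.normal(size=(len(vocab), d))      # hidden-to-vocab (softmax) layer

h = np.zeros(d)
for w in ["I", "saw", "the", "dog"]:      # run the Elman recurrence
    x = E[vocab.index(w)]
    h = np.tanh(W @ x + V @ h)

probs = softmax(Z @ h)                    # P(w | I saw the dog)
```

`probs` is a distribution over the whole vocabulary, one entry per candidate next word.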
Training: "backpropagation through time"
= backprop through the unrolled network:
  (1) params (W, V, Z)
  (2) embeddings x_1, x_2, ...

loss: -log P(w_i*)  (negative log-likelihood of the true next word)

V receives multiple updates (it is applied at every timestep) ⇒ no problem: the gradient contributions from each use just sum.
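The "gradients just sum" point can be checked on a toy example. Below is a scalar "RNN" with identity activation (all values here are made up for illustration): v is used at both timesteps, and the BPTT gradient is the sum of the two contributions, which matches a finite-difference check.

```python
# Tiny scalar "RNN" with identity activation: h_i = w*x_i + v*h_{i-1}.
# Loss L = h_2. v is used at both timesteps, so dL/dv is the SUM of the
# contributions from each use -- that is all BPTT does.
w, v, h0 = 0.5, 0.8, 1.0
x1, x2 = 2.0, -1.0

def loss(v_):
    h1 = w * x1 + v_ * h0
    h2 = w * x2 + v_ * h1
    return h2

# Analytic BPTT gradient:
#   use at step 2 contributes h1; use at step 1 contributes v * h0
h1 = w * x1 + v * h0
grad_bptt = h1 + v * h0

# Check against a central finite difference
eps = 1e-6
grad_fd = (loss(v + eps) - loss(v - eps)) / (2 * eps)
assert abs(grad_bptt - grad_fd) < 1e-6
```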
Many types of RNNs:
- Elman network: input → cell → output + next state
- Long short-term memory networks (LSTMs, 1997)

"Short-term memory" = what the model can remember in its state.

[diagram: x_1 → □ → ... → □; after many steps, does the model still "remember" x_1?]

LSTM = LONG short-term memory (remembers for longer).
Problem w/ Elman networks: vanishing/exploding gradients

h_i = tanh(W x_i + V h_{i-1})

Unrolling three steps:
h_3 = tanh(W x_3 + V tanh(W x_2 + V tanh(W x_1 + V h_0)))

Assume tanh is the identity:
h_3 = W x_3 + V W x_2 + V^2 W x_1

After n steps ⇒ V^n multiplies x_1: its contribution (and its gradient) vanishes if V's eigenvalues are below 1 and explodes if they are above 1.
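A quick numeric illustration of the V^n effect, with a made-up diagonal V so the eigenvalues are easy to control:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=4)

norms = {}
for scale in (0.5, 1.5):         # eigenvalues all below 1 vs. all above 1
    V = scale * np.eye(4)
    h = x1.copy()
    for _ in range(20):          # after n = 20 steps, x1 has been hit by V^20
        h = V @ h
    norms[scale] = np.linalg.norm(h)

# scale 0.5: contribution of x1 has vanished; scale 1.5: it has exploded
```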
LSTM gating:
Elman:  h_i = tanh(W x_i + V h_{i-1})
Gated:  c_i = c_{i-1} ⊙ f + function(x_i, h_{i-1}) ⊙ i
        (c_{i-1} = prev state, ⊙ = elementwise product)

f: forget gate, values in [0,1]
If f = 1: c_{i-1} is totally preserved.
Where do f, i come from?
f = sigmoid(W^f x_i + V^f h_{i-1} + b_forget)
i = sigmoid(W^i x_i + V^i h_{i-1} + b_input)
(bias terms added for completeness)
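The two gate equations in numpy, with made-up random weights (the matrix names Wf, Vf, etc. are illustrative, not from the lecture code). The sigmoid squashes every entry into (0, 1), which is what lets the gates act as soft on/off switches:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
d = 4
x_i, h_prev = rng.normal(size=d), rng.normal(size=d)

# One (W, V, b) triple per gate; all values are random placeholders
Wf, Vf, bf = rng.normal(size=(d, d)), rng.normal(size=(d, d)), np.zeros(d)
Wi, Vi, bi = rng.normal(size=(d, d)), rng.normal(size=(d, d)), np.zeros(d)

f = sigmoid(Wf @ x_i + Vf @ h_prev + bf)   # forget gate, entries in (0, 1)
i = sigmoid(Wi @ x_i + Vi @ h_prev + bi)   # input gate, entries in (0, 1)
```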
[diagram: LSTM cell, as in Chris Olah's blog: forget gate, input gate, and output gate acting on the cell state]

Cell update: c_i = f ⊙ c_{i-1} + i ⊙ g, where g = tanh(W^g x_i + V^g h_{i-1})
LSTM: 8 weight matrices (a W and a V for each of f, i, g, o)

LSTM state = tuple of:
- hidden state h̄
- cell state c̄

Output: h_i = o ⊙ tanh(c_i), where o is the output gate.
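Putting the pieces together: a single LSTM step in numpy. This is a sketch of the standard cell, not the lecture's `lstm-lecture.py`; parameter names and the random initialization are made up, but the 8 weight matrices (a W and a V per gate) and the (h, c) state tuple match the notes above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params holds the 8 weight matrices plus 4 biases."""
    Wf, Vf, Wi, Vi, Wg, Vg, Wo, Vo, bf, bi, bg, bo = params
    f = sigmoid(Wf @ x + Vf @ h_prev + bf)   # forget gate
    i = sigmoid(Wi @ x + Vi @ h_prev + bi)   # input gate
    g = np.tanh(Wg @ x + Vg @ h_prev + bg)   # candidate cell update
    o = sigmoid(Wo @ x + Vo @ h_prev + bo)   # output gate
    c = f * c_prev + i * g                   # new cell state
    h = o * np.tanh(c)                       # new hidden state
    return h, c                              # the LSTM state tuple

rng = np.random.default_rng(0)
d = 4
params = [rng.normal(size=(d, d)) for _ in range(8)] + [np.zeros(d)] * 4
h, c = np.zeros(d), np.zeros(d)
for _ in range(3):                           # run a few steps on random inputs
    h, c = lstm_step(rng.normal(size=d), h, c, params)
```

Note that |h| < 1 always holds, since h is an elementwise product of a sigmoid output and a tanh output.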
Poll: discussed lstm-lecture.py