CS 378 Lecture 16
Today:
- RNNs / LSTMs (the type of RNN you will be using)
- Implementation
- Midterm back soon

Language model: P(w_i | w_1, ..., w_{i-1})  ("predict the next word")
Recap: RNNs + language modeling (embedding dim d = 50)

[diagram: RNN unrolled over the words of "I saw the dog"; each word is embedded and hidden states feed forward left to right]

P(w | I saw the dog) = softmax(Z h_i)
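The recap above can be sketched in a few lines of numpy. This is a minimal illustration, not the lecture's actual code: the vocabulary, dimension (4 instead of 50), and random parameters W, V, Z, E are all stand-ins.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
vocab = ["I", "saw", "the", "dog", "ran"]
d = 4                                     # tiny stand-in for d = 50
E = rng.normal(size=(len(vocab), d))      # word embeddings
W = rng.normal(size=(d, d))               # input-to-hidden weights
V = rng.normal(size=(d, d))               # hidden-to-hidden weights
Z = rng.normal(size=(len(vocab), d))      # hidden-to-vocab (softmax) layer

h = np.zeros(d)
for w in ["I", "saw", "the", "dog"]:      # run the Elman recurrence
    x = E[vocab.index(w)]
    h = np.tanh(W @ x + V @ h)

probs = softmax(Z @ h)                    # P(w | I saw the dog)
```

`probs` is a distribution over the whole vocabulary, one entry per candidate next word.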
Training: "backpropagation through time"
= backprop through the unrolled network:
  (1) params (W, V, Z)
  (2) embeddings x_1, x_2, ...

loss: -log P(w_i*)  (negative log-likelihood of the true next word)

V receives multiple updates (it is applied at every timestep) ⇒ no problem: the gradient contributions from each use just sum.
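The "gradients just sum" point can be checked on a toy example. Below is a scalar "RNN" with identity activation (all values here are made up for illustration): v is used at both timesteps, and the BPTT gradient is the sum of the two contributions, which matches a finite-difference check.

```python
# Tiny scalar "RNN" with identity activation: h_i = w*x_i + v*h_{i-1}.
# Loss L = h_2. v is used at both timesteps, so dL/dv is the SUM of the
# contributions from each use -- that is all BPTT does.
w, v, h0 = 0.5, 0.8, 1.0
x1, x2 = 2.0, -1.0

def loss(v_):
    h1 = w * x1 + v_ * h0
    h2 = w * x2 + v_ * h1
    return h2

# Analytic BPTT gradient:
#   use at step 2 contributes h1; use at step 1 contributes v * h0
h1 = w * x1 + v * h0
grad_bptt = h1 + v * h0

# Check against a central finite difference
eps = 1e-6
grad_fd = (loss(v + eps) - loss(v - eps)) / (2 * eps)
assert abs(grad_bptt - grad_fd) < 1e-6
```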
Many types of RNNs:
- Elman network: input → cell → output + next state
- Long short-term memory networks (LSTMs, 1997)

"Short-term memory" = what the model can remember in its state.

[diagram: x_1 → □ → ... → □; after many steps, does the model still "remember" x_1?]

LSTM = LONG short-term memory (remembers for longer).
Problem w/ Elman networks: vanishing/exploding gradients

h_i = tanh(W x_i + V h_{i-1})

Unrolling three steps:
h_3 = tanh(W x_3 + V tanh(W x_2 + V tanh(W x_1 + V h_0)))

Assume tanh is the identity:
h_3 = W x_3 + V W x_2 + V^2 W x_1

After n steps ⇒ V^n multiplies x_1: its contribution (and its gradient) vanishes if V's eigenvalues are below 1 and explodes if they are above 1.
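A quick numeric illustration of the V^n effect, with a made-up diagonal V so the eigenvalues are easy to control:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=4)

norms = {}
for scale in (0.5, 1.5):         # eigenvalues all below 1 vs. all above 1
    V = scale * np.eye(4)
    h = x1.copy()
    for _ in range(20):          # after n = 20 steps, x1 has been hit by V^20
        h = V @ h
    norms[scale] = np.linalg.norm(h)

# scale 0.5: contribution of x1 has vanished; scale 1.5: it has exploded
```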
LSTM gating:
Elman:  h_i = tanh(W x_i + V h_{i-1})
Gated:  c_i = c_{i-1} ⊙ f + function(x_i, h_{i-1}) ⊙ i
        (c_{i-1} = prev state, ⊙ = elementwise product)

f: forget gate, values in [0,1]
If f = 1: c_{i-1} is totally preserved.
Where do f, i come from?
f = sigmoid(W^f x_i + V^f h_{i-1} + b_forget)
i = sigmoid(W^i x_i + V^i h_{i-1} + b_input)
(bias terms added for completeness)
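The two gate equations in numpy, with made-up random weights (the matrix names Wf, Vf, etc. are illustrative, not from the lecture code). The sigmoid squashes every entry into (0, 1), which is what lets the gates act as soft on/off switches:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
d = 4
x_i, h_prev = rng.normal(size=d), rng.normal(size=d)

# One (W, V, b) triple per gate; all values are random placeholders
Wf, Vf, bf = rng.normal(size=(d, d)), rng.normal(size=(d, d)), np.zeros(d)
Wi, Vi, bi = rng.normal(size=(d, d)), rng.normal(size=(d, d)), np.zeros(d)

f = sigmoid(Wf @ x_i + Vf @ h_prev + bf)   # forget gate, entries in (0, 1)
i = sigmoid(Wi @ x_i + Vi @ h_prev + bi)   # input gate, entries in (0, 1)
```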
[diagram: LSTM cell, as in Chris Olah's blog: forget gate, input gate, and output gate acting on the cell state]

Cell update: c_i = f ⊙ c_{i-1} + i ⊙ g, where g = tanh(W^g x_i + V^g h_{i-1})
LSTM: 8 weight matrices (a W and a V for each of f, i, g, o)

LSTM state = tuple of:
- hidden state h̄
- cell state c̄

Output: h_i = o ⊙ tanh(c_i), where o is the output gate.
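Putting the pieces together: a single LSTM step in numpy. This is a sketch of the standard cell, not the lecture's `lstm-lecture.py`; parameter names and the random initialization are made up, but the 8 weight matrices (a W and a V per gate) and the (h, c) state tuple match the notes above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params holds the 8 weight matrices plus 4 biases."""
    Wf, Vf, Wi, Vi, Wg, Vg, Wo, Vo, bf, bi, bg, bo = params
    f = sigmoid(Wf @ x + Vf @ h_prev + bf)   # forget gate
    i = sigmoid(Wi @ x + Vi @ h_prev + bi)   # input gate
    g = np.tanh(Wg @ x + Vg @ h_prev + bg)   # candidate cell update
    o = sigmoid(Wo @ x + Vo @ h_prev + bo)   # output gate
    c = f * c_prev + i * g                   # new cell state
    h = o * np.tanh(c)                       # new hidden state
    return h, c                              # the LSTM state tuple

rng = np.random.default_rng(0)
d = 4
params = [rng.normal(size=(d, d)) for _ in range(8)] + [np.zeros(d)] * 4
h, c = np.zeros(d), np.zeros(d)
for _ in range(3):                           # run a few steps on random inputs
    h, c = lstm_step(rng.normal(size=d), h, c, params)
```

Note that |h| < 1 always holds, since h is an elementwise product of a sigmoid output and a tanh output.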
Poll: discussed lstm-lecture.py