Page 1: (Artificial) Neural Networks

(Artificial) Neural Networks
Details and Examples

Ashkan Yousefpour September, 2018

Computer Science
University of Texas at Dallas

CS7301-003 Fall 2018

Page 2: (Artificial) Neural Networks

Outline
• Introduction

• Perceptron

• Activation Functions

• Exercise

• Training Rule

• Gradient Descent

• Exercise

• Artificial Neural Networks

• Different Types

• Exercises

• Back Propagation

• Exercise

Page 3: (Artificial) Neural Networks

Introduction
• Artificial Neural Networks (ANNs) provide interesting alternatives for solving a variety of problems in different fields of science and engineering

• Inspiration: the human brain

• An ultimate goal of computer science is to build a machine that can mimic the human brain (a biological neural network)

• ANNs are simplifications of biological neural networks

• ANNs have proven their applicability and importance by solving complex problems (e.g. the emergence of deep neural networks, “deep learning”)

Page 4: (Artificial) Neural Networks

Motivation for this Lecture


By the end of this lecture, we will be able to solve some concrete exercises like this one

Exercise 1

Page 5: (Artificial) Neural Networks

ANN Building Block
• The main building block of an ANN is the perceptron

• An ANN is a combination of many perceptrons, connected in a bigger network

• Perceptron with a step activation function (sketched in code below)

Picture borrowed from https://www.hlt.utdallas.edu/~vgogate/ml/2018s/lectures/Perceptrons.pdf
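
A minimal Python sketch of this building block (illustrative code, not from the slides): a weighted sum followed by a step activation.

    # Perceptron with step activation: fire (output 1) iff w . x > 0.
    # weights[0] is the bias weight w0; its input is fixed to 1.
    def perceptron(weights, inputs):
        total = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
        return 1 if total > 0 else 0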

Page 6: (Artificial) Neural Networks

Perceptron
• Usually in ANN diagrams, the linear unit (the weighted sum) and the activation unit are shown as one circle

Picture borrowed from http://aima.eecs.berkeley.edu/slides-pdf/chapter20b.pdf

Page 7: (Artificial) Neural Networks

Perceptron Example

• Spam detection

• 3 features: frequencies of the words “money” and “lottery”, plus a bias

• Spam is the “positive” class

• Current weights: (w0, w1, w2) = (−3, 4, 2)

• Email is “win lottery money” → W · X = (1)(−3) + (1)(4) + (1)(2) = 3 > 0 → Spam!
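
Checking this with the perceptron sketch from the earlier slide (illustrative):

    weights = [-3, 4, 2]   # (w0, w1, w2) from the slide
    email = [1, 1]         # "win lottery money": money appears once, lottery once
    print(perceptron(weights, email))   # prints 1 -> classified as spam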

Page 8: (Artificial) Neural Networks

Perceptron Activation Functions

• Activation functions:

• Identity function

• Step function

• Sigmoid function (aka “logistic”)

• ReLU function

• See https://en.wikipedia.org/wiki/Activation_function
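
The four activation functions above as Python one-liners (for reference):

    import math

    def identity(z): return z                        # linear unit output unchanged
    def step(z): return 1 if z > 0 else 0            # threshold at zero
    def sigmoid(z): return 1 / (1 + math.exp(-z))    # aka "logistic"
    def relu(z): return max(0.0, z)                  # rectified linear unit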

Page 9: (Artificial) Neural Networks

Perceptron-Implementable Functions

• Exercise: implement NOT, AND, and OR using a perceptron (one solution is sketched below)

• Linearly separable functions can be implemented with a perceptron (e.g. AND)

• The decision surface of a perceptron is a hyperplane (a line in 2D)
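
One possible solution, reusing the perceptron function from the earlier sketch (these weights are one choice among many):

    NOT_w = [0.5, -1]      # NOT x:      0.5 - x       > 0 iff x = 0
    AND_w = [-1.5, 1, 1]   # x1 AND x2:  x1 + x2 - 1.5 > 0 iff x1 = x2 = 1
    OR_w  = [-0.5, 1, 1]   # x1 OR x2:   x1 + x2 - 0.5 > 0 iff x1 = 1 or x2 = 1

    assert perceptron(NOT_w, [0]) == 1 and perceptron(NOT_w, [1]) == 0
    assert perceptron(AND_w, [1, 1]) == 1 and perceptron(AND_w, [0, 1]) == 0
    assert perceptron(OR_w, [0, 0]) == 0 and perceptron(OR_w, [1, 0]) == 1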

Page 10: (Artificial) Neural Networks

Perceptron Training
• We found perceptrons for AND, OR, and NOT by hand

• How about bigger examples, e.g. an optical-network reconfiguration plan given 200 features?

• How can a computer find the weights automatically?

Page 11: (Artificial) Neural Networks

Perceptron Training Rule
• Training rule: wi ← wi + Δwi, where Δwi = η(t − o)xi

• η is the learning rate (a small constant, e.g. 0.1)

• o is the output of the perceptron, including the activation function

• t is the target (desired) value
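
A sketch of one training-rule update in Python (reusing the perceptron function from before; η = 0.1 as suggested above):

    def train_step(weights, inputs, target, eta=0.1):
        x = [1] + list(inputs)              # prepend the fixed bias input x0 = 1
        o = perceptron(weights, inputs)     # output o, including the activation
        # w_i <- w_i + eta * (t - o) * x_i
        return [w + eta * (target - o) * xi for w, xi in zip(weights, x)]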

Page 12: (Artificial) Neural Networks

Perceptron Training Rule
• The perceptron training rule works well

• However, what happens if the data are not linearly separable?

• The weights go back and forth

• The rule will not converge!

• We need another training rule:

• Gradient descent (or gradient ascent)

Page 13: (Artificial) Neural Networks

Gradient Descent

• Let’s think about an error (or loss) function l(W):

l(W) = E(W) = ½ ∑d∈D (td − od)²

• Can we somehow get to the minimum?

• Yes, by following the gradient ▽l(W)

Page 14: (Artificial) Neural Networks

Gradient Descent
• Error function E(W)

• Start somewhere at random (on the E(W) surface)

• Move downwards using the gradient (we will see how soon)

• Hopefully you reach the global minimum

• Why not always? (a non-convex surface can trap you in a local minimum; see the sketch below)

Picture borrowed from https://www.hlt.utdallas.edu/~vgogate/ml/2018s/lectures/Perceptrons.pdf
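
A tiny illustration of getting stuck (the function here is hypothetical, not from the slides):

    # f(w) = w**4 - 3*w**2 + w has a global minimum near w = -1.30
    # and a local minimum near w = +1.13; the start point decides the outcome.
    def f_grad(w):
        return 4 * w**3 - 6 * w + 1

    def descend(w, eta=0.01, steps=2000):
        for _ in range(steps):
            w -= eta * f_grad(w)
        return w

    print(descend(-2.0))   # ~ -1.30: reaches the global minimum
    print(descend(+2.0))   # ~ +1.13: stuck in the local minimum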

Page 15: (Artificial) Neural Networks

Perceptron Gradient Descent

• Error function: E(W) = ½ ∑d∈D (td − od)²

• D is the set of training examples (i.e. the data)

• Gradient: ▽E(W) = [∂E/∂w0, ∂E/∂w1, …, ∂E/∂wn]

• Training rule (gradient descent): Δwi = −η ∂E/∂wi

Page 16: (Artificial) Neural Networks

Perceptron Gradient Descent

• Exercise: derive the gradient for each activation function

• Activation: identity → ∂E/∂wi = ∑d (td − od)(−xi,d)

• Activation: sigmoid → ∂E/∂wi = ∑d (td − od) od(1 − od)(−xi,d)
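
For the sigmoid case, the chain-rule step (a worked derivation, not on the slide): with od = σ(netd), netd = ∑i wi xi,d, and σ′(z) = σ(z)(1 − σ(z)),

    ∂E/∂wi = ∑d (td − od) · (−∂od/∂wi)
    ∂od/∂wi = σ′(netd) · ∂netd/∂wi = od(1 − od) · xi,d
    ⇒ ∂E/∂wi = ∑d (td − od) od(1 − od)(−xi,d)

The identity case follows the same way with ∂od/∂wi = xi,d.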

Page 17: (Artificial) Neural Networks

Perceptron Gradient Descent
1. Initialize each wi to some small random value
2. Until convergence, do
   1. Initialize each Δwi to zero
   2. For each example x in the training data, do
      1. Input the example and compute the output o
      2. For each linear unit weight wi, do
         Δwi ← Δwi + η(t − o)xi (identity activation), or
         Δwi ← Δwi + η(t − o)o(1 − o)xi (sigmoid activation)
         (both are instances of Δwi ← Δwi − η ∂Ed/∂wi)
   3. For each linear unit weight wi, do
      wi ← wi + Δwi
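
The algorithm above as a Python sketch for a single sigmoid unit (the OR data at the end is an illustrative toy example):

    import math, random

    def sigmoid(z): return 1 / (1 + math.exp(-z))

    def train(data, n_features, eta=0.1, epochs=5000):
        w = [random.uniform(-0.05, 0.05) for _ in range(n_features + 1)]  # step 1
        for _ in range(epochs):                          # "until convergence"
            dw = [0.0] * len(w)                          # step 2.1
            for x, t in data:                            # step 2.2
                xb = [1] + list(x)                       # fixed bias input
                o = sigmoid(sum(wi * xi for wi, xi in zip(w, xb)))
                for i, xi in enumerate(xb):              # step 2.2.2, sigmoid rule
                    dw[i] += eta * (t - o) * o * (1 - o) * xi
            w = [wi + dwi for wi, dwi in zip(w, dw)]     # step 2.3
        return w

    # Usage: learn OR from its truth table
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
    w = train(data, n_features=2)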

Page 18: (Artificial) Neural Networks

Neural Networks
Neural network: connect perceptrons (neurons) to build bigger structures

1. Feed-forward NN (ANN)

2. Recurrent Neural Network (RNN)

3. Convolutional Neural Networks (CNN)

Key learning algorithm: Back Propagation (BP)

A recent work: Dosovitskiy, Alexey, et al. "Flownet: Learning optical flow with convolutional networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 2758-2766. 2015.

Page 19: (Artificial) Neural Networks

ANN
1. Feed-forward NN (ANN): one direction, fully connected

1. Single-layer perceptron

2. Multi-layer perceptron (MLP)

3. Deep Neural Network (DNN)

Picture borrowed from https://people.cs.pitt.edu/~xianeizhang/notes/NN/NN.html

Page 20: (Artificial) Neural Networks

RNN
2. Recurrent Neural Network (RNN)

• Directed cycles and delays

• Recognizes patterns in time

Picture borrowed from http://cseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2018/Lectures/TrackingRNN.pdf

Page 21: (Artificial) Neural Networks

CNN
3. Convolutional Neural Networks (CNN)

• Not fully connected; units are connected in a convolutional style

• Recognizes patterns in space

Picture borrowed from http://www.stat.ucla.edu/~xianjie.chen/projects/pose_estimation/pose_estimation.html

Page 22: (Artificial) Neural Networks

Exercise
Now let’s look at some examples

These examples are borrowed from Dr. Vibhav Gogate’s Machine Learning class. (Fall 2014 Midterm and Spring 2012 Final)

Page 23: (Artificial) Neural Networks

Exercise 1

Page 24: (Artificial) Neural Networks

Exercise 1 Solution

Page 25: (Artificial) Neural Networks

Exercise 1 Solution

Page 26: (Artificial) Neural Networks

Exercise 1 Solution

Page 27: (Artificial) Neural Networks

Exercise 2

Page 28: (Artificial) Neural Networks

Exercise 2 Solution

Page 29: (Artificial) Neural Networks

Exercise 2 Solution

Page 30: (Artificial) Neural Networks

Back Propagation
1. Initialize all network weights to some small random value
2. Until convergence, do
   1. For each example x in the training data, do
      1. Input the example and compute the outputs
      2. For each output unit k, do
         δk ← ok(1 − ok)(tk − ok)
      3. For each hidden unit h, do
         δh ← oh(1 − oh) ∑u∈next_layer wh,u δu
      4. Update each network weight:
         wi,j ← wi,j + Δwi,j, where Δwi,j = η δj oi,j and oi,j is the input from node i into unit j

(the δ formulas above are for the sigmoid activation; they change for other activation functions)
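
A compact Python sketch of this procedure for a fully-connected sigmoid network (the layer representation and names are mine, not the slides’): each layer is a list of rows [bias, w1, w2, ...], one row per unit.

    import math

    def sigmoid(z): return 1 / (1 + math.exp(-z))

    def forward(layers, x):
        outs = [list(x)]                     # outs[l] = activations of layer l
        for W in layers:
            x = [sigmoid(row[0] + sum(w * xi for w, xi in zip(row[1:], x)))
                 for row in W]
            outs.append(x)
        return outs

    def backprop_step(layers, x, t, eta=0.1):
        outs = forward(layers, x)
        # output units: delta_k = o_k (1 - o_k) (t_k - o_k)
        deltas = [[o * (1 - o) * (tk - o) for o, tk in zip(outs[-1], t)]]
        # hidden units: delta_h = o_h (1 - o_h) * sum_u w_{h,u} delta_u
        for l in range(len(layers) - 1, 0, -1):
            W = layers[l]                    # weights into the next layer
            deltas.insert(0, [o * (1 - o) *
                              sum(W[u][h + 1] * deltas[0][u] for u in range(len(W)))
                              for h, o in enumerate(outs[l])])
        # updates: w_{i,j} <- w_{i,j} + eta * delta_j * (input i into unit j)
        for l, W in enumerate(layers):
            for j, row in enumerate(W):
                row[0] += eta * deltas[l][j]                 # bias input is 1
                for i, oi in enumerate(outs[l]):
                    row[i + 1] += eta * deltas[l][j] * oi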

Page 31: (Artificial) Neural Networks

Back Propagation in Action

[Figure: a feed-forward network with inputs x1, x2, x3; hidden units 2 and 3; output unit 1; input-to-hidden weights V12, V22, V23, V33; hidden-to-output weights V21, V31; bias weights V01, V02, V03]

Page 32: (Artificial) Neural Networks

Back Propagation in Action

[Figure: the same network as on the previous slide]

o2 = σ(V02 + V12x1 + V22x2)
o3 = σ(V03 + V23x2 + V33x3)
o1 = σ(V01 + V21o2 + V31o3)

Page 33: (Artificial) Neural Networks

Back Propagation in Action

[Figure: the same network as above]

δ1 = o1(1 − o1)(t − o1)
δ2 = o2(1 − o2)δ1V21
δ3 = o3(1 − o3)δ1V31

Page 34: (Artificial) Neural Networks

Back Propagation in Action

[Figure: the same network as above]

ΔV21 = η · δ1 · o2    ΔV31 = η · δ1 · o3
ΔV12 = η · δ2 · x1    ΔV22 = η · δ2 · x2
ΔV23 = η · δ3 · x2    ΔV33 = η · δ3 · x3
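
The three slides above as straight-line Python for this 3-2-1 network (the input values, initial weights, and target below are made up for illustration; note the slide omits the bias updates ΔV0j = η · δj):

    import math
    sig = lambda z: 1 / (1 + math.exp(-z))

    eta, t = 0.1, 1.0
    x1, x2, x3 = 1.0, 0.0, 1.0                  # hypothetical inputs
    V01, V02, V03 = 0.0, 0.0, 0.0               # bias weights
    V12, V22, V23, V33 = 0.5, -0.5, 0.3, 0.8    # input -> hidden
    V21, V31 = 0.7, -0.2                        # hidden -> output

    # forward pass (Page 32)
    o2 = sig(V02 + V12 * x1 + V22 * x2)
    o3 = sig(V03 + V23 * x2 + V33 * x3)
    o1 = sig(V01 + V21 * o2 + V31 * o3)

    # deltas (Page 33)
    d1 = o1 * (1 - o1) * (t - o1)
    d2 = o2 * (1 - o2) * d1 * V21
    d3 = o3 * (1 - o3) * d1 * V31

    # weight updates (Page 34)
    V21 += eta * d1 * o2;  V31 += eta * d1 * o3
    V12 += eta * d2 * x1;  V22 += eta * d2 * x2
    V23 += eta * d3 * x2;  V33 += eta * d3 * x3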

Page 35: (Artificial) Neural Networks

Exercise 3

Page 36: (Artificial) Neural Networks

Exercise 3

Page 37: (Artificial) Neural Networks

Exercise 3 Solution

Page 38: (Artificial) Neural Networks

Exercise 3 Solution

Page 39: (Artificial) Neural Networks

Exercise 3 Solution
