Support Vector Machines Part 2
Page 1: Support Vector Machines Part 2

Support Vector Machines Part 2

Page 2: Support Vector Machines Part 2

Recap of SVM algorithm

Given training set S = {(x1, y1), (x2, y2), ..., (xm, ym)}, where each xi ∈ ℝn and yi ∈ {+1, −1}

1. Choose a cheap-to-compute kernel function k(x,z)

2. Apply a quadratic programming procedure (using the kernel function k) to find the {αi} and bias b, where αi ≠ 0 only if xi is a support vector.

Page 3: Support Vector Machines Part 2

3. Now, given a new instance x, find the classification of x by computing

h(x) = sgn( Σi αi yi k(xi, x) + b )

(the sum effectively runs only over the support vectors, since αi = 0 everywhere else).
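As a concrete illustration of the three steps, here is a minimal sketch in Python. It uses scikit-learn’s SVC as the quadratic-programming solver; the library, the toy data, and the polynomial kernel are assumptions for illustration, not something the slides prescribe.

```python
import numpy as np
from sklearn.svm import SVC

# Toy training set S = {(xi, yi)} with xi in R^2 and yi in {+1, -1}.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.5], [3.0, 3.0]])
y = np.array([-1, -1, +1, +1])

# Step 1: choose a cheap-to-compute kernel (here, a degree-2 polynomial).
clf = SVC(kernel="poly", degree=2, C=1.0)

# Step 2: fitting solves the QP. dual_coef_ stores alpha_i * y_i for the
# support vectors only; alpha_i = 0 for every other training point.
clf.fit(X, y)
print(clf.support_)      # indices of the support vectors
print(clf.dual_coef_)    # alpha_i * y_i for each support vector
print(clf.intercept_)    # the bias b

# Step 3: classify a new instance x via the sign of
# sum_i alpha_i y_i k(xi, x) + b, which decision_function computes.
x_new = np.array([[1.5, 1.8]])
print(clf.decision_function(x_new), clf.predict(x_new))
```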

Page 4: Support Vector Machines Part 2

Clarifications from last time

Page 5: Support Vector Machines Part 2

(Figure: margin of a linear SVM; source: http://nlp.stanford.edu/IR-book/html/htmledition/img1260.png)

Without changing the problem, we can rescale w and b to set a = 1, so the margin hyperplanes become wᵀx + b = ±1

Length of the margin: with a = 1, the distance between the two margin hyperplanes wᵀx + b = +1 and wᵀx + b = −1 is 2/‖w‖.
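To see why the margin has this length: take points x+ and x− on the two hyperplanes, so wᵀx+ + b = 1 and wᵀx− + b = −1. Subtracting gives wᵀ(x+ − x−) = 2, and projecting x+ − x− onto the unit normal w/‖w‖ shows that the perpendicular distance between the hyperplanes is 2/‖w‖.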

Page 6: Support Vector Machines Part 2

w is perpendicular to the decision boundary: for any two points x1 and x2 on the boundary, wᵀx1 + b = wᵀx2 + b = 0, so wᵀ(x1 − x2) = 0.

Page 7: Support Vector Machines Part 2

More on Kernels

• So far we’ve seen kernels that map instances in ℝn to instances in ℝz, where z > n.

• One way to create a kernel: figure out an appropriate feature space Φ(x), and find a kernel function k that defines the inner product on that space.

• In practice, though, we usually don’t know the appropriate feature space Φ(x).

• What people do in practice is either:

1. Use one of the “classic” kernels (e.g., polynomial), or
2. Define their own function that is appropriate for their task, and show that it qualifies as a kernel.

Page 8: Support Vector Machines Part 2

How to define your own kernel

• Given training data (x1, x2, ..., xn)

• The algorithm for SVM learning uses the kernel matrix (also called the Gram matrix):

Kij = k(xi, xj), for i, j = 1, ..., n

• We can choose some function k, and compute the kernel matrix K using the training data.

• We just have to guarantee that our kernel defines an inner product on some feature space.

• Not as hard as it sounds.
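A small sketch of this recipe in Python (NumPy, the kernel function, and the toy data are assumptions for illustration): define your own candidate k and build the kernel matrix K over the training data.

```python
import numpy as np

def my_kernel(x, z):
    # Custom kernel candidate: (1 + x.z)^2. This happens to equal the
    # degree-2 polynomial kernel, so it is known to be a valid kernel.
    return (1.0 + np.dot(x, z)) ** 2

# Toy training data (x1, ..., xn); any real-valued instances work the same.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.5], [3.0, 3.0]])
n = len(X)

# Kernel (Gram) matrix: K[i, j] = k(xi, xj).
K = np.array([[my_kernel(X[i], X[j]) for j in range(n)] for i in range(n)])
print(K)
```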

Page 9: Support Vector Machines Part 2

What counts as a kernel?

• Mercer’s Theorem: If the kernel matrix K is symmetric and positive semidefinite, it defines a kernel on the training data; that is, it defines an inner product in some feature space.

• We don’t even have to know what that feature space is! It can have a huge number of dimensions.
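Continuing the sketch from the previous slide, here is one way to check Mercer’s condition numerically in Python (again assuming NumPy and the toy kernel and data from before): symmetry plus no negative eigenvalues means K is positive semidefinite.

```python
import numpy as np

# Rebuild the Gram matrix from the previous sketch (any candidate K works).
def my_kernel(x, z):
    return (1.0 + np.dot(x, z)) ** 2

X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.5], [3.0, 3.0]])
K = np.array([[my_kernel(a, b) for b in X] for a in X])

# Mercer-style check on the training data: K must be symmetric and have
# no negative eigenvalues (i.e., be positive semidefinite).
assert np.allclose(K, K.T), "kernel matrix must be symmetric"

eigenvalues = np.linalg.eigvalsh(K)  # eigvalsh handles symmetric matrices
# Tolerate tiny negative values caused by floating-point round-off.
print("valid kernel on this data:", bool(np.all(eigenvalues >= -1e-9)))
```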

Page 12: Support Vector Machines Part 2

In-class exercises

Note for part (c):

