
Learning Structural SVMs with Latent Variables

Page 1: Learning Structural SVMs with Latent Variables

Learning Structural SVMs with Latent Variables

Xionghao Liu

Page 2: Learning Structural SVMs with Latent Variables

Annotation Mismatch

Action Classification

[Figure: input image x; annotation y = "jumping"; latent h]

Input x

Annotation y

Latent h

Desired output during test time is y

Exact value of the latent variable is not "important"

Mismatch between desired and available annotations

Page 3: Learning Structural SVMs with Latent Variables

• Latent SVM

• Optimization

• Practice

• Extensions

Outline – Annotation Mismatch

Andrews et al., NIPS 2001; Smola et al., AISTATS 2005; Felzenszwalb et al., CVPR 2008; Yu and Joachims, ICML 2009

Page 4: Learning Structural SVMs with Latent Variables

Weakly Supervised Data

Input x

Output y ∈ {-1, +1}

Hidden h

[Figure: image x; label y = +1; latent h]

Page 5: Learning Structural SVMs with Latent Variables

Weakly Supervised Classification

Feature Φ(x,h)

Joint Feature Vector

Ψ(x,y,h)

Page 6: Learning Structural SVMs with Latent Variables

Weakly Supervised Classification

Feature Φ(x,h)

Joint Feature Vector

Ψ(x,+1,h) = [Φ(x,h); 0]

Page 7: Learning Structural SVMs with Latent Variables

Weakly Supervised Classification

Feature Φ(x,h)

Joint Feature Vector

Ψ(x,-1,h) = [0; Φ(x,h)]
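The two block-stacked cases above can be written in a few lines. Below is a minimal sketch, assuming a user-supplied feature map phi(x, h) that returns a d-dimensional NumPy vector; the name joint_feature and the dense representation are illustrative choices, not the talk's implementation.

```python
import numpy as np

def joint_feature(phi, x, y, h, d):
    """Joint feature vector Psi(x, y, h) for binary labels.

    Psi(x, +1, h) = [phi(x, h); 0] and Psi(x, -1, h) = [0; phi(x, h)],
    so a single weight vector w of dimension 2d scores both labels.
    """
    psi = np.zeros(2 * d)
    offset = 0 if y == +1 else d  # label selects which block holds phi
    psi[offset:offset + d] = phi(x, h)
    return psi
```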

Page 8: Learning Structural SVMs with Latent Variables

Weakly Supervised Classification

Feature Φ(x,h)

Joint Feature Vector

Ψ(x,y,h)

Score f : Ψ(x,y,h) → (-∞, +∞)

Optimize the score over all possible y and h

Page 9: Learning Structural SVMs with Latent Variables

Latent SVM

Parameters w

Scoring function wᵀΨ(x,y,h)

Prediction (y(w), h(w)) = argmax_{y,h} wᵀΨ(x,y,h)
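As a sketch of the prediction rule, assuming the latent space H is a finite list that can be enumerated (e.g. a set of candidate regions), and reusing the joint_feature helper from the sketch above:

```python
import numpy as np

def predict(w, x, labels, H, phi, d):
    """(y(w), h(w)) = argmax over (y, h) of w^T Psi(x, y, h)."""
    best_pair, best_score = None, -np.inf
    for y in labels:          # e.g. labels = [+1, -1]
        for h in H:
            score = w @ joint_feature(phi, x, y, h, d)
            if score > best_score:
                best_pair, best_score = (y, h), score
    return best_pair
```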

Page 10: Learning Structural SVMs with Latent Variables

Learning Latent SVM

Training data {(xi,yi), i = 1,2,…,n}

Empirical risk minimization

min_w Σi Δ(yi, yi(w))

No restriction on the loss function

Annotation mismatch

Page 11: Learning Structural SVMs with Latent Variables

Learning Latent SVM

Empirical risk minimization

min_w Σi Δ(yi, yi(w))

Non-convex

Parameters cannot be regularized

Find a regularization-sensitive upper bound

Page 12: Learning Structural SVMs with Latent Variables

Learning Latent SVM

Δ(yi, yi(w)) + wᵀΨ(xi,yi(w),hi(w)) - wᵀΨ(xi,yi(w),hi(w))

(add and subtract the score of the predicted output)

Page 13: Learning Structural SVMs with Latent Variables

Learning Latent SVM

≤ Δ(yi, yi(w)) + wᵀΨ(xi,yi(w),hi(w)) - max_hi wᵀΨ(xi,yi,hi)

since (y(w), h(w)) = argmax_{y,h} wᵀΨ(x,y,h)

Page 14: Learning Structural SVMs with Latent Variables

Learning Latent SVM

min_w ||w||² + C Σi ξi

max_{y,h} [wᵀΨ(xi,y,h) + Δ(yi, y)] - max_hi wᵀΨ(xi,yi,hi) ≤ ξi

Parameters can be regularized

Is this also convex?

Page 15: Learning Structural SVMs with Latent Variables

Learning Latent SVM

min_w ||w||² + C Σi ξi

max_{y,h} [wᵀΨ(xi,y,h) + Δ(yi, y)] - max_hi wᵀΨ(xi,yi,hi) ≤ ξi

Convex - Convex

Difference of convex (DC) program
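To make the two max terms concrete, here is a sketch that evaluates the slack ξi implied by the constraint for one example, given a task loss Delta (the loss function is unrestricted; 0-1 loss is one choice). The brute-force enumeration and names are assumptions for illustration, reusing joint_feature from the earlier sketch.

```python
def slack(w, x_i, y_i, labels, H, phi, d, Delta):
    """xi_i = max_{y,h} [w^T Psi + Delta(y_i, y)] - max_h w^T Psi(x_i, y_i, h)."""
    # Loss-augmented term: a pointwise max of linear functions, convex in w.
    aug = max(w @ joint_feature(phi, x_i, y, h, d) + Delta(y_i, y)
              for y in labels for h in H)
    # Fit term: also a max of linear functions; its negation is concave in w,
    # which is what makes the overall objective a difference of convex functions.
    fit = max(w @ joint_feature(phi, x_i, y_i, h, d) for h in H)
    return aug - fit
```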

Page 16: Learning Structural SVMs with Latent Variables

Recap

Scoring function

wᵀΨ(x,y,h)

Prediction

(y(w), h(w)) = argmax_{y,h} wᵀΨ(x,y,h)

Learning

min_w ||w||² + C Σi ξi

wᵀΨ(xi,y,h) + Δ(yi,y) - max_hi wᵀΨ(xi,yi,hi) ≤ ξi, for all y, h

Page 17: Learning Structural SVMs with Latent Variables

• Latent SVM

• Optimization

• Practice

• Extensions

Outline – Annotation Mismatch

Page 18: Learning Structural SVMs with Latent Variables

Learning Latent SVM

min_w ||w||² + C Σi ξi

max_{y,h} [wᵀΨ(xi,y,h) + Δ(yi, y)] - max_hi wᵀΨ(xi,yi,hi) ≤ ξi

Difference of convex (DC) program

Page 19: Learning Structural SVMs with Latent Variables

Concave-Convex Procedure

max_{y,h} [wᵀΨ(xi,y,h) + Δ(yi, y)] - max_hi wᵀΨ(xi,yi,hi)

Linear upper-bound of concave part

Page 20: Learning Structural SVMs with Latent Variables

Concave-Convex Procedure

max_{y,h} [wᵀΨ(xi,y,h) + Δ(yi, y)] - max_hi wᵀΨ(xi,yi,hi)

Optimize the convex upper bound

Page 21: Learning Structural SVMs with Latent Variables

Concave-Convex Procedure

max_{y,h} [wᵀΨ(xi,y,h) + Δ(yi, y)] - max_hi wᵀΨ(xi,yi,hi)

Linear upper-bound of concave part

Page 22: Learning Structural SVMs with Latent Variables

Concave-Convex Procedure

max_{y,h} [wᵀΨ(xi,y,h) + Δ(yi, y)] - max_hi wᵀΨ(xi,yi,hi)

Until convergence

Page 23: Learning Structural SVMs with Latent Variables

Concave-Convex Procedure

max_{y,h} [wᵀΨ(xi,y,h) + Δ(yi, y)] - max_hi wᵀΨ(xi,yi,hi)

Linear upper bound?

Page 24: Learning Structural SVMs with Latent Variables

Linear Upper Bound

- max_hi wᵀΨ(xi,yi,hi)

Current estimate = wt

hi* = argmax_hi wtᵀΨ(xi,yi,hi)

- wᵀΨ(xi,yi,hi*) ≥ - max_hi wᵀΨ(xi,yi,hi), tight at w = wt
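In code, the linearization step is just a latent-variable completion at the current estimate; a minimal sketch, again reusing the joint_feature helper:

```python
def impute_latent(w_t, x_i, y_i, H, phi, d):
    """hi* = argmax over h in H of w_t^T Psi(x_i, y_i, h).

    Fixing hi* makes -w^T Psi(x_i, y_i, hi*) linear in w, and it
    upper-bounds -max_h w^T Psi(x_i, y_i, h), with equality at w = w_t.
    """
    return max(H, key=lambda h: w_t @ joint_feature(phi, x_i, y_i, h, d))
```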

Page 25: Learning Structural SVMs with Latent Variables

CCCP for Latent SVM

Start with an initial estimate w0

Update hi* = argmax_{hi ∈ H} wtᵀΨ(xi,yi,hi)

Update wt+1 as the ε-optimal solution of

min_w ||w||² + C Σi ξi

wᵀΨ(xi,yi,hi*) - wᵀΨ(xi,y,h) ≥ Δ(yi, y) - ξi, for all y, h

Repeat until convergence
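Putting the pieces together, here is a sketch of the CCCP loop, reusing joint_feature and impute_latent from the earlier sketches. The talk does not specify the inner convex solver; the subgradient descent below, with hypothetical step size lr and iteration counts, is one simple stand-in rather than the ε-optimal cutting-plane solvers typically used in practice.

```python
import numpy as np

def cccp(data, labels, H, phi, d, Delta, C=1.0, lr=1e-3,
         outer_iters=20, inner_iters=100):
    """CCCP for latent SVM: alternate latent imputation and convex learning."""
    w = np.zeros(2 * d)
    for _ in range(outer_iters):
        # Step 1: impute hi* with the current estimate w_t (linearization).
        latent = [impute_latent(w, x, y, H, phi, d) for x, y in data]
        # Step 2: approximately minimize the convex upper bound for w_{t+1}.
        for _ in range(inner_iters):
            grad = w.copy()  # subgradient of ||w||^2 / 2
            for (x, y_i), h_star in zip(data, latent):
                # Loss-augmented inference: most violated (y, h) for this example.
                y_hat, h_hat = max(((y, h) for y in labels for h in H),
                                   key=lambda p: w @ joint_feature(phi, x, *p, d)
                                                 + Delta(y_i, p[0]))
                margin = (w @ joint_feature(phi, x, y_hat, h_hat, d)
                          + Delta(y_i, y_hat)
                          - w @ joint_feature(phi, x, y_i, h_star, d))
                if margin > 0:  # constraint active: add its subgradient
                    grad += C * (joint_feature(phi, x, y_hat, h_hat, d)
                                 - joint_feature(phi, x, y_i, h_star, d))
            w -= lr * grad
    return w
```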

Page 26: Learning Structural SVMs with Latent Variables

Thanks & QA

