CSTalks - On machine learning - 2 Mar

Post on 20-Nov-2014


On Machine Learning

at CSTalks

by Vlad Hosu

Introduction

Fundamental Questions

•What are the fundamental laws that govern all learning processes?

•How can we build computer systems that automatically improve with experience?

3

Learning: Method

•a process of adaptation

•by which a parametric model is automatically adjusted

•so that some fitness criterion is more readily met

4

Before Learning

I’m learning, hence I need to adapt!

5

After

Result: Liony adjusts his diet.

6

Biological Learning

•Model: nervous system (neuron connectivity, chemical changes, etc.)

•Fitness: improved behavior (skills, memory, knowledge)

7

Machine Learning

•a mathematical model

•with adjustable parameters

•optimizing some fitness function

8

Motivation

9

Why?

•some things are hard to code

•too much data

•automatic learning works better

•is easier to customize/personalize

10

Learning: Purpose

•estimation

•function - stock market

•class - recognition

•structure - grouping

Requirements

•good learning ability

•scalability to large problems

•simple and easy algorithm implementation

12

Things Ahead

• Problems

• Clustering

• Classification

• Regression

• Learning issues

• importance of domain knowledge

• learning/generalization ability

• model complexity issues

• Optimization

13

Important Problems

14

Clustering

15

Classification

[Figure: plot with axes x1 and x2]

16

Classification

•Types

•discriminative

•generative

[Figure: plot with axes x1 and x2]

17


Classification

19

Regression

20

Making Connections

•discrete value regression => generative classification

•regression on boundary space => discriminative classification

•clustering + labels => classification

21

Learning Issues

22

Domain Knowledge

•exploitation of problem structure

•human abstractions are better

•important for picking the right model

23

Grouping in Images

•groups together similar parts of an image

•select objects

•find patterns

•features = a function of the pixel values
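A minimal sketch of grouping pixels by color, assuming the features are raw RGB triples and that plain k-means is an acceptable grouping method (the slides do not name a specific algorithm here). The function name `kmeans` and the toy pixel list are illustrative only.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on a list of equal-length tuples (e.g. RGB pixels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centers[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centers, clusters

# Toy "image": three reddish and three bluish pixels.
pixels = [(250, 10, 10), (240, 20, 15), (245, 5, 20),
          (10, 10, 250), (20, 15, 240), (5, 20, 245)]
centers, clusters = kmeans(pixels, k=2)
```

On this toy input the two clusters should separate the reddish pixels from the bluish ones. Real images usually need a better-suited color space or feature function, which is the point the following slides make.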

24

Segmentation

25

Color Space

[Figure: RGB space]

26

Color Space (cont)

27

Suitable Clustering

28

Generalization Ability

•what is learned from the training data should generalize to new data

•important for classification accuracy

29

Support Vector Machines (SVM)

•linear classifier in a distorted (transformed) feature space
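The idea of a linear classifier in a distorted space can be sketched as follows. This is not an SVM: the feature map and the weights are hand-picked assumptions, chosen only to show that data which is not linearly separable in (x1, x2) can become separable after a transformation.

```python
def feature_map(x1, x2):
    """Hypothetical distortion of the input space: square each coordinate."""
    return (x1 * x1, x2 * x2)

def linear_classifier(z, weights=(-1.0, -1.0), bias=1.0):
    """Return 1 if w . z + b > 0, else 0. Weights are hand-picked, not trained."""
    score = sum(w * v for w, v in zip(weights, z)) + bias
    return 1 if score > 0 else 0

# Points inside the unit circle are class 1, points outside are class 0.
# No straight line in the original (x1, x2) plane separates them.
inside = [(0.0, 0.0), (0.5, 0.5), (-0.3, 0.6)]
outside = [(1.5, 0.0), (-1.0, 1.0), (0.0, -2.0)]

predictions_inside = [linear_classifier(feature_map(*p)) for p in inside]
predictions_outside = [linear_classifier(feature_map(*p)) for p in outside]
```

In the transformed space the boundary x1² + x2² = 1 is a straight line, so a linear rule is enough. An actual SVM would additionally learn the weights by maximizing the margin.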

30

Learning Ability

overfitting

31

Problems with Over-fitting

32

SVM vs Decision Trees

33

Complexity Issues

•models should be

•as simple as possible

•but representative of the training data

34

Neural Networks

•model: weights

•fitness: output error

•general function ∑
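A minimal sketch of a single unit of such a network, assuming a sigmoid activation and a squared output error (the slides only name weights, a summation and an output error). The names `neuron`, `squared_error` and `train_step` are illustrative.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs (the summation unit) passed through a sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def squared_error(output, target):
    """Fitness: half the squared difference between output and target."""
    return 0.5 * (output - target) ** 2

def train_step(inputs, target, weights, bias, rate=0.5):
    """One gradient-descent update of the model (weights and bias) on one example."""
    out = neuron(inputs, weights, bias)
    # Derivative of the error with respect to the weighted sum,
    # for a sigmoid unit with squared error.
    delta = (out - target) * out * (1.0 - out)
    new_weights = [w - rate * delta * x for w, x in zip(weights, inputs)]
    return new_weights, bias - rate * delta

weights, bias = [0.0, 0.0], 0.0
error_before = squared_error(neuron([1.0, 1.0], weights, bias), 1.0)
weights, bias = train_step([1.0, 1.0], 1.0, weights, bias)
error_after = squared_error(neuron([1.0, 1.0], weights, bias), 1.0)
```

After one update the output error on this example should be smaller than before. A full network stacks many such units and propagates the error backwards through the layers.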

35

Training a Network

36

Non-trivial Functions

37

Optimization

38

Optimizing Fitness

•find extrema

•strategies

•gradient descent

•convex optimization

39

Optimization

•finding extrema

•local/global
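The difference between local and global extrema can be illustrated with gradient descent on a one-dimensional function. The function below is an arbitrary example chosen for this sketch (it has two minima), and the step size and iteration count are assumptions.

```python
def f(x):
    """Example fitness function with one global and one local minimum."""
    return x ** 4 - 3 * x ** 2 + x

def df(x):
    """Derivative of f."""
    return 4 * x ** 3 - 6 * x + 1

def gradient_descent(start, rate=0.01, steps=2000):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x -= rate * df(x)
    return x

left = gradient_descent(-2.0)   # ends near x = -1.30, the global minimum
right = gradient_descent(2.0)   # ends near x = 1.13, only a local minimum
```

Both runs stop where the gradient vanishes, but only the first one finds the lowest value of f. Which minimum is reached depends entirely on the starting point.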

40

Gradient Descent

41

Problem: Local Extrema

42

Problem: Speed

43

Linear Programming

[Figure: plot with axes x1 and x2]

lines define a convex function (planes in 3D, etc.)

44

Considerations

•scaling to large feature spaces

•feature selection

•dimensionality reduction

45

Open Problems

46

Open Problems

•unlabeled data for regression

•exploiting sparsity in high dimensional spaces for non-parametric learning

•transferring learnt information from one task to simplify learning another

47

Open Problems (cont)

•algorithms for learning control strategies from delayed rewards and other inputs

•best “active learning” strategies for different learning problems

•the degree to which one can preserve data privacy while obtaining the benefits of data mining

48

The end

Questions?

Types of Regression

•parametric

•non-parametric

50

Linear vs Non-linear

• linear

• smooth

• under-fitting

• good enough for some processes (e.g. business)

• non-linear

• complex

• over-fitting

• works on most data-sets

51

Naive Bayes

[Figure: Naive Bayes on the words "write", "people", "free" for the classes good and spam; class score = No.Good (or No.Spam) * ∏ of per-word terms]
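A minimal sketch of a Naive Bayes spam filter along these lines: each class is scored by its message count times a product of per-word probabilities. The add-one smoothing, the function names and the toy messages are assumptions made for this example, not taken from the slides.

```python
from collections import Counter

def train(messages):
    """messages: list of (label, text). Returns per-class message and word counts."""
    class_counts = Counter()
    word_counts = {}
    for label, text in messages:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return class_counts, word_counts

def score(text, label, class_counts, word_counts, vocabulary_size):
    """Class count times the product of smoothed per-word probabilities."""
    counts = word_counts[label]
    total_words = sum(counts.values())
    result = float(class_counts[label])
    for word in text.lower().split():
        # Add-one smoothing (an assumption) so unseen words do not zero the product.
        result *= (counts[word] + 1) / (total_words + vocabulary_size)
    return result

def classify(text, class_counts, word_counts):
    """Pick the class with the highest score."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    return max(
        class_counts,
        key=lambda label: score(text, label, class_counts, word_counts, len(vocabulary)),
    )

toy_messages = [
    ("good", "write to people"),
    ("good", "people write letters"),
    ("spam", "free money free"),
    ("spam", "free offer"),
]
class_counts, word_counts = train(toy_messages)
label_a = classify("free money", class_counts, word_counts)
label_b = classify("write people", class_counts, word_counts)
```

The "naive" part is the product: words are treated as independent given the class. For long messages a practical version would sum logarithms instead of multiplying, to avoid numerical underflow.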

52

Graph Clustering

53

Mean Shift

54

Problems in CV

•What are the physical and geometric processes that govern (digital) imaging?

•What are the “informative” areas of an image and how do we detect them?

•What portions of an image pertain to one another and to relevant physical phenomena?

•From one (or more) images, how can we determine the geometry of the scene?

55

Linear Regression

•model: straight line

•2 adjustable parameters

•fitness function: root mean squared error
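A minimal sketch of fitting the straight-line model, assuming the two adjustable parameters are the slope and the y-shift (intercept) and using the closed-form least-squares solution, which also minimizes the root mean squared error. The function names and sample data are illustrative.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def rmse(xs, ys, slope, intercept):
    """Fitness function: root mean squared error of the line on the data."""
    squared = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(squared / len(xs))

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # points lying exactly on y = 2x + 1
slope, intercept = fit_line(xs, ys)
error = rmse(xs, ys, slope, intercept)
```

Because the sample points lie exactly on a line, the fitted error is zero here. With noisy data the same two functions return the best-fitting line and its remaining error.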

56

Solution Stability

[Figure axes: y-shift, slope]

57

Some Issues with Model Selection

[Figure panels: normal, outliers, wrong model]

58

Real Photo in Color Space

[Figure panels: EM, KMeans]

59

Conjugate Gradient

60

Newton’s Method
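A minimal sketch of Newton's method used for optimization in one dimension, assuming the first and second derivatives are available in closed form. The example function and the starting point are arbitrary choices for illustration.

```python
def df(x):
    """First derivative of the example function f(x) = x**4 - 3*x**2 + x."""
    return 4 * x ** 3 - 6 * x + 1

def d2f(x):
    """Second derivative of the same function."""
    return 12 * x ** 2 - 6

def newton_minimize(first, second, start, steps=20):
    """Iterate x <- x - f'(x) / f''(x) to find a point where f'(x) = 0."""
    x = start
    for _ in range(steps):
        x -= first(x) / second(x)
    return x

stationary_point = newton_minimize(df, d2f, start=2.0)
```

Using curvature information, the iteration typically needs far fewer steps than plain gradient descent. It converges to a nearby stationary point, which may be a local rather than a global extremum, and it fails where the second derivative is zero.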

61