On Machine Learning
at CSTalks
by Vlad Hosu
Introduction
Fundamental Questions
•What are the fundamental laws that govern all learning processes?
•How can we build computer systems that automatically improve with experience?
3
Learning: Method
•a process of adaptation
•by which a parametric model is automatically adjusted
•so that some fitness criterion is more readily met
4
Before Learning
I’m learning, hence I need to adapt!
5
After
Result: Liony adjusts his diet.
6
Biological Learning
•Model: nervous system (neuron connectivity, chemical changes, etc.)
•Fitness: improved behavior (skills, memory, knowledge)
7
Machine Learning
•a mathematical model
•with adjustable parameters
•optimizing some fitness function
8
Motivation
9
Why?
•some things are hard to code
•too much data
•automatic learning works better
•is easier to customize/personalize
10
Learning: Purpose
•estimation of a
•function, e.g. stock market prediction
•class, e.g. recognition
•structure, e.g. grouping
Requirements
•good learning ability
•scalability to large problems
•simple and easy algorithm implementation
12
Things Ahead
• Problems
• Clustering
• Classification
• Regression
• Learning issues
• importance of domain knowledge
• learning/generalization ability
• model complexity issues
• Optimization
13
Important Problems
14
Clustering
15
Classification
[Figure: axes x1, x2]
16
Classification
•Types
•discriminative
•generative
[Figure: axes x1, x2]
17
Classification
•Types
•discriminative
•generative
[Figure: axes x1, x2; class labels 1 and 0]
18
Classification
19
Regression
20
Making Connections
•discrete-value regression => generative classification
•regression on boundary space => discriminative classification
•clustering + labels => classification
21
Learning Issues
22
Domain Knowledge
•exploitation of problem structure
•human abstractions are better
•important for picking the right model
23
Grouping in Images
•group together similar parts of an image
•select objects
•find patterns
•features = functions of pixel values
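Grouping pixels by their values can be sketched with k-means, one common clustering method (the slides do not say which method was used). The pixel values, the starting centers, and the helper names below are all hypothetical, chosen only for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_color(pixels):
    """Component-wise mean of a non-empty list of RGB tuples."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def kmeans(pixels, centers, iterations=10):
    """Alternate between assigning pixels to the nearest center and re-averaging."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in pixels:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean_color(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical pixels: three reddish, three bluish.
pixels = [(250, 10, 10), (240, 20, 15), (255, 0, 5),
          (10, 10, 250), (20, 15, 240), (0, 5, 255)]

# Assumption: one starting center is taken from each color group.
centers, clusters = kmeans(pixels, centers=[pixels[0], pixels[3]])
print(centers)
print(clusters)
```

Here the feature is the raw RGB value; any function of the pixel values (for example a different color space) could be plugged in before clustering.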
24
Segmentation
25
Color Space
[Figure: two plots in RGB space]
26
Color Space (cont)
27
Suitable Clustering
28
Generalization Ability
•a model fit to the training data should generalize to new data
•important for classification accuracy
29
Support Vector Machines (SVM)
•linear classifier in a distorted (transformed) feature space
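The "distorted space" idea can be illustrated without training an actual SVM. In this minimal sketch, points inside and outside a circle cannot be split by a line in (x1, x2), but become linearly separable once the squared radius is added as a third coordinate. The data, the feature map, and the hand-picked weights are assumptions for illustration; a real SVM would learn the separating plane.

```python
def feature_map(x1, x2):
    """Distort the 2D input into 3D by adding the squared radius as a new axis."""
    return (x1, x2, x1 * x1 + x2 * x2)


def linear_classify(features, weights, bias):
    """Plain linear classifier: 1 if w . z + b > 0, else 0."""
    score = sum(w * z for w, z in zip(weights, features)) + bias
    return 1 if score > 0 else 0


# Hypothetical data: class 0 lies near the origin, class 1 surrounds it.
inner = [(0.5, 0.0), (0.0, -0.5), (-0.3, 0.3)]
outer = [(2.0, 0.0), (0.0, -2.0), (-1.5, 1.5)]

# Hand-picked plane in the distorted space: squared radius > 2.
weights, bias = (0.0, 0.0, 1.0), -2.0

inner_labels = [linear_classify(feature_map(*p), weights, bias) for p in inner]
outer_labels = [linear_classify(feature_map(*p), weights, bias) for p in outer]
print(inner_labels, outer_labels)
```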
30
Learning Ability
overfitting
31
Problems with Over-fitting
32
SVM vs Decision Trees
33
Complexity Issues
•models should be
•as simple as possible
•but representative of the training data
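The trade-off between simplicity and fit can be sketched by fitting polynomials of different degree to the same noisy data. The data below is synthetic (a straight line plus noise) and the degrees 1 and 9 are arbitrary choices for illustration. The complex model always fits the training points at least as well, which says nothing about how it behaves on new points.

```python
import numpy as np

# Synthetic data (assumption): a straight line y = 2x plus Gaussian noise.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + rng.normal(0.0, 0.1, size=10)
x_test = np.linspace(0.05, 0.95, 10)
y_test = 2.0 * x_test + rng.normal(0.0, 0.1, size=10)


def rmse(coeffs, x, y):
    """Root mean squared error of a polynomial on the given points."""
    return float(np.sqrt(np.mean((np.polyval(coeffs, x) - y) ** 2)))


train_error, test_error = {}, {}
for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_error[degree] = rmse(coeffs, x_train, y_train)
    test_error[degree] = rmse(coeffs, x_test, y_test)
    print(degree, train_error[degree], test_error[degree])
```

Comparing the two printed rows shows whether the lower training error of the degree-9 model carries over to the held-out points.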
34
Neural Networks
•model: weights
•fitness: output error
•general function: weighted sum of inputs (∑)
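A single neuron makes the "model: weights, fitness: output error" idea concrete. This is a minimal sketch, assuming a sigmoid activation, a cross-entropy error (so the update is proportional to prediction minus target), and the logical AND function as hypothetical training data. None of these choices come from the slides.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Weighted sum of the inputs passed through a sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train(samples, epochs=5000, rate=0.5):
    """Adjust the weights a little after every sample to reduce the output error."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = predict(weights, bias, inputs) - target
            weights = [w - rate * error * x for w, x in zip(weights, inputs)]
            bias -= rate * error
    return weights, bias


# Hypothetical task: learn logical AND.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(and_samples)
outputs = [round(predict(weights, bias, inputs)) for inputs, _ in and_samples]
print(outputs)
```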
35
Training a Network
36
Non-trivial Functions
37
Optimization
38
Optimizing Fitness
•find extrema
•strategies
•gradient descent
•convex optimization
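Gradient descent, the first strategy listed, can be sketched in a few lines: repeatedly step against the gradient. The function f(x) = (x - 3)^2, the step size, and the step count are assumptions picked for illustration.

```python
def gradient_descent(gradient, start, rate=0.1, steps=100):
    """Move against the gradient by a fixed fraction at every step."""
    x = start
    for _ in range(steps):
        x -= rate * gradient(x)
    return x


# Example function (assumption): f(x) = (x - 3)^2, whose gradient is 2 (x - 3).
minimum = gradient_descent(lambda x: 2.0 * (x - 3.0), start=0.0)
print(minimum)
```

Because this example function is convex, the single minimum found is also the global one.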
39
Optimization
•finding extrema
•local/global
40
Gradient Descent
41
Problem: Local Extrema
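The local-extrema problem can be reproduced on a small non-convex function: the same descent procedure ends in different minima depending on where it starts. The function f(x) = x^4 - 3x^2 + x, the step size, and the two starting points are assumptions chosen for illustration.

```python
def f(x):
    return x ** 4 - 3 * x ** 2 + x


def df(x):
    """Derivative of f."""
    return 4 * x ** 3 - 6 * x + 1


def descend(x, rate=0.01, steps=2000):
    """Plain gradient descent from the given starting point."""
    for _ in range(steps):
        x -= rate * df(x)
    return x


left = descend(-2.0)   # ends in the deeper minimum
right = descend(2.0)   # gets stuck in the shallower minimum
print(left, f(left))
print(right, f(right))
```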
42
Problem: Speed
43
Linear Programming
[Figure: axes x1, x2]
lines define a convex function
planes in 3D etc
44
Considerations
•scaling to large feature spaces
•feature selection
•dimensionality reduction
45
Open Problems
46
Open Problems
•unlabeled data for regression
•exploiting sparsity in high dimensional spaces for non-parametric learning
•transferring learnt information from one task to simplify learning another
47
Open Problems (cont)
•algorithms for learning control strategies from delayed rewards and other inputs
•best “active learning” strategies for different learning problems
•degree to which one can preserve data privacy while obtaining the benefits of data mining
48
The end
Questions?
Types of Regression
•parametric
•non-parametric
50
Linear vs Non-linear
• linear
• smooth
• under-fitting
• good enough for some processes (e.g. business)
• non-linear
• complex
• over-fitting
• works on most data-sets
51
Naive Bayes
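[Figure: spam filter with classes "good" and "spam"; example words "write", "people", "free"; per-class product (∏) of word probabilities, combined with No. Good and No. Spam]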
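A naive Bayes spam filter of this kind can be sketched as follows: each class score is the class prior (from message counts) times the product of per-word probabilities. The training messages are hypothetical, and add-one (Laplace) smoothing is an assumption added so that unseen words do not zero out a score.

```python
from collections import Counter


def train(docs_by_class):
    """Count words and messages per class."""
    word_counts = {c: Counter(w for d in docs for w in d.split())
                   for c, docs in docs_by_class.items()}
    doc_counts = {c: len(docs) for c, docs in docs_by_class.items()}
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def score(message, cls, word_counts, doc_counts, vocab):
    """Prior times the product of smoothed word probabilities for one class."""
    total_words = sum(word_counts[cls].values())
    result = doc_counts[cls] / sum(doc_counts.values())
    for word in message.split():
        result *= (word_counts[cls][word] + 1) / (total_words + len(vocab))
    return result


def classify(message, model):
    """Pick the class with the highest score."""
    _, doc_counts, _ = model
    return max(doc_counts, key=lambda c: score(message, c, *model))


# Hypothetical training messages.
model = train({
    "good": ["write to people", "people write"],
    "spam": ["free free", "free people"],
})
print(classify("free", model))
print(classify("write people", model))
```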
52
Graph Clustering
53
Mean Shift
54
Problems in CV
•What are the physical and geometric processes that govern (digital) imaging?
•What are the “informative” areas of an image and how do we detect them?
•What portions of an image pertain to one another and to relevant physical phenomena?
•From one (or more) images, how can we determine the geometry of the scene?
55
Linear Regression
•model: straight line
•2 adjustable parameters
•fitness function: root mean squared error
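The three bullets above map directly onto a few lines of code: the two adjustable parameters are the slope and the intercept, and the fit is judged by the root mean squared error. This sketch uses the standard closed-form least-squares solution; the data points and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept of a straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(slope, intercept, xs, ys):
    """Root mean squared error of the line on the given points."""
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical noisy measurements of a line.
xs = [0, 1, 2, 3]
ys = [1.1, 2.9, 5.2, 6.8]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, rmse(slope, intercept, xs, ys))
```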
56
Solution Stability
y-shift
slope
57
Some Issues with Model Selection
normal
outliers
wrong model
58
Real Photo in Color Space
[Figure panels: EM, KMeans]
59
Conjugate Gradient
60
Newton’s Method
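Newton's method for minimization uses the second derivative to choose the step size: x is updated by the ratio of the first derivative to the second. A minimal one-dimensional sketch, assuming the example function f(x) = x - ln(x), which has its minimum at x = 1, and a hypothetical starting point of 0.5:

```python
def newton_minimize(d1, d2, x, steps=8):
    """Repeat x <- x - f'(x) / f''(x) for a fixed number of steps."""
    for _ in range(steps):
        x -= d1(x) / d2(x)
    return x


# Example function (assumption): f(x) = x - ln(x) for x > 0.
# First derivative: 1 - 1/x. Second derivative: 1/x^2.
x_min = newton_minimize(lambda x: 1.0 - 1.0 / x,
                        lambda x: 1.0 / (x * x),
                        x=0.5)
print(x_min)
```

On this function the distance to the minimum is squared at every step, which is why a handful of iterations is enough. Gradient descent with a fixed rate would need many more.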
61