Introduction to Artificial Intelligence
CSCI 3202, Fall 2007

Introduction to Classification
Greg Grudic


This Class: Classification Models

• Collect training data
• Construct model: happy = F(feature space)
• Make a prediction

[Figure: high-dimensional feature (input) space]


Binary Classification

• A binary classifier is a mapping from a set of d inputs to a single output that can take on one of TWO values (e.g. path/no path)

• In the most general setting:

inputs: $\mathbf{x} \in \mathbb{R}^d$

output: $y \in \{-1, +1\}$

• Specifying the output classes as -1 and +1 is arbitrary!
– Often done as a mathematical convenience
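To illustrate why the -1/+1 encoding is only a convention, here is a minimal Python sketch. The label strings come from the "path/no path" example above; the choice of which class gets +1, and the names `LABEL_TO_Y`, `encode` and `flip`, are assumptions made for illustration.

```python
# Assumed mapping: which class is called +1 is arbitrary.
LABEL_TO_Y = {"path": +1, "no path": -1}


def encode(labels):
    """Turn class names into the numeric outputs y in {-1, +1}."""
    return [LABEL_TO_Y[label] for label in labels]


def flip(ys):
    """Swap the two encodings; the classification problem itself is unchanged."""
    return [-y for y in ys]


ys = encode(["path", "no path", "path"])
flipped = flip(ys)
```

Flipping the encoding only renames the classes; any classifier for one encoding gives a classifier for the other by negating its output.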


A Binary Classifier

$\mathbf{x}$ → Classification Model → $\hat{y} \in \{-1, +1\}$

Given learning data: $(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_N, y_N)$

A model is constructed: $M(\mathbf{x})$, where $\mathbf{x}$ is not in the learning set!


Classification Learning Data…

            x1         x2        y
Example 1   0.95013    0.58279    1
Example 2   0.23114    0.4235    -1
Example 3   0.8913     0.43291    1
Example 4   0.018504   0.76037   -1
…           …          …          …


The Learning Data

• Matrix representation of N learning examples of d-dimensional inputs

$$
\begin{pmatrix}
x_{11} & \cdots & x_{1d} & y_1 \\
\vdots & \ddots & \vdots & \vdots \\
x_{N1} & \cdots & x_{Nd} & y_N
\end{pmatrix}
$$
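As a sketch of this representation, the four examples listed in the learning-data table can be stored as an N × (d+1) list of rows and split into inputs and outputs. The variable names (`data`, `X`, `y`, `N`, `d`) are chosen for illustration; only the numbers come from the slides.

```python
# Each row is (x1, x2, y): the four examples shown in the learning-data table.
data = [
    [0.95013, 0.58279, 1],
    [0.23114, 0.4235, -1],
    [0.8913, 0.43291, 1],
    [0.018504, 0.76037, -1],
]

# Split every row into its d input values and its single output.
X = [row[:-1] for row in data]  # N rows of d inputs
y = [row[-1] for row in data]   # N outputs in {-1, +1}

N = len(data)   # number of learning examples
d = len(X[0])   # input dimension
```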


Graphical Representation of 2D Classification Training Data

[Figure: scatter plot of the 2D training data, with x1 on the horizontal axis and x2 on the vertical axis (both from 0 to 1.2); separate markers for y=+1 and y=-1]


Linear Separating Hyper-Planes: Discriminative Classifiers

How many lines can separate these points?

NO!


Linear Separating Hyper-Planes (2 dimensions)

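[Figure: a separating line in the ($x_1$, $x_2$) plane, with the $y = -1$ region on one side and the $y = +1$ region on the other]

Decision boundary: $\beta_0 + \beta_1 x_1 + \beta_2 x_2 = 0$

Region $y = -1$: $\beta_0 + \beta_1 x_1 + \beta_2 x_2 \le 0$

Region $y = +1$: $\beta_0 + \beta_1 x_1 + \beta_2 x_2 > 0$

$(\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2) \in \mathbb{R}^3$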
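A worked example may help here. The parameter values below are hand-picked for illustration (they correspond to the vertical line $x_1 = 0.5$) and are an assumption, not values from the slides; the two points are Example 1 and Example 2 from the learning-data table.

```latex
% Assumed parameters: (\beta_0, \beta_1, \beta_2) = (-0.5, 1, 0)
\begin{align*}
\text{Example 1: } & -0.5 + 1 \cdot 0.95013 + 0 \cdot 0.58279 = 0.45013 > 0
  && \Rightarrow\ y = +1 \\
\text{Example 2: } & -0.5 + 1 \cdot 0.23114 + 0 \cdot 0.4235 = -0.26886 \le 0
  && \Rightarrow\ y = -1
\end{align*}
```

Both results agree with the labels listed for those examples in the table.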


Linear Separating Hyper-Planes (d dimensions)

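[Figure: a separating hyper-plane drawn against axes $x_1$ and $x_2$, with the $y = -1$ region on one side and the $y = +1$ region on the other]

Decision boundary: $\beta_0 + \sum_{i=1}^{d} \beta_i x_i = 0$

Region $y = -1$: $\beta_0 + \sum_{i=1}^{d} \beta_i x_i \le 0$

Region $y = +1$: $\beta_0 + \sum_{i=1}^{d} \beta_i x_i > 0$

$(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_d) \in \mathbb{R}^{d+1}$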


Linear Separating Hyper-Planes

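• The Model:

$\hat{y} = M(\mathbf{x}) = \operatorname{sgn}\left[\hat{\beta}_0 + (\hat{\beta}_1, \ldots, \hat{\beta}_d) \cdot \mathbf{x}\right]$

• Where:

$\operatorname{sgn}[A] = \begin{cases} 1 & \text{if } A > 0 \\ -1 & \text{otherwise} \end{cases}$

• The decision boundary:

$\hat{\beta}_0 + (\hat{\beta}_1, \ldots, \hat{\beta}_d) \cdot \mathbf{x} = \hat{\beta}_0 + \sum_{i=1}^{d} \hat{\beta}_i x_i = 0$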
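The model can be sketched in a few lines of Python. The function names `sgn` and `predict` and the parameter values in the usage lines are illustrative assumptions; the rule itself follows the definition above, including the convention that a score of exactly 0 maps to -1.

```python
def sgn(a):
    """Return +1 if a > 0 and -1 otherwise (so sgn(0) is -1, as defined above)."""
    return 1 if a > 0 else -1


def predict(beta0, beta, x):
    """Evaluate sgn[beta0 + beta . x] for one d-dimensional input x."""
    if len(beta) != len(x):
        raise ValueError("beta and x must both have d entries")
    score = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return sgn(score)


# Usage with hand-picked (assumed) parameters: the boundary is x1 = 0.5.
y_hat_1 = predict(-0.5, [1.0, 0.0], [0.95013, 0.58279])
y_hat_2 = predict(-0.5, [1.0, 0.0], [0.23114, 0.4235])
```

Note that this `sgn` differs from the sign function in many numerical libraries, which return 0 for an input of 0.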


Linear Separating Hyper-Planes

• The model parameters are:

$(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_d) \in \mathbb{R}^{d+1}$

• The hat on the betas means that they are estimated from the data

• Many different learning algorithms have been proposed for determining $(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_d)$
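As one concrete instance of such a learning algorithm, here is a minimal sketch of the classic perceptron update rule. The slides do not name a specific algorithm at this point, so the perceptron is an assumption chosen for illustration, as are the function name `train_perceptron` and the `max_epochs` cap. The data are the four examples from the learning-data table.

```python
def train_perceptron(X, y, max_epochs=1000):
    """Estimate (beta0, beta) with the perceptron rule.

    Returns (beta0, beta, converged); converged is True once a full pass
    over the data produces no misclassified example.
    """
    d = len(X[0])
    beta0 = 0.0
    beta = [0.0] * d
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in zip(X, y):
            score = beta0 + sum(b * xi for b, xi in zip(beta, x))
            prediction = 1 if score > 0 else -1
            if prediction != target:
                # Move the hyper-plane toward classifying x correctly.
                beta0 += target
                beta = [b + target * xi for b, xi in zip(beta, x)]
                mistakes += 1
        if mistakes == 0:
            return beta0, beta, True
    return beta0, beta, False


# The four examples listed in the learning-data table.
X = [[0.95013, 0.58279], [0.23114, 0.4235],
     [0.8913, 0.43291], [0.018504, 0.76037]]
y = [1, -1, 1, -1]

beta0_hat, beta_hat, converged = train_perceptron(X, y)
```

The perceptron is guaranteed to stop making mistakes only when the data are linearly separable; on non-separable data this sketch simply gives up after `max_epochs` passes and reports `converged` as False.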

Is this Data Linearly Separable?


NO!



YES!



NO!



YES!
