Feature Selection
Jamshid Shanbehzadeh, Samaneh Yazdani
Department of Computer Engineering, Faculty of Engineering, Kharazmi University (Tarbiat Moallem University of Tehran)
Page 1:

Feature Selection

Jamshid Shanbehzadeh, Samaneh Yazdani

Department of Computer Engineering, Faculty of Engineering, Kharazmi University (Tarbiat Moallem University of Tehran)

Page 2:

Outline

Page 3:

Outline

Part 1: Dimension Reduction
• Dimension
• Feature Space
• Definition & Goals
• Curse of dimensionality
• Research and Application
• Grouping of dimension reduction methods

Part 2: Feature Selection
• Parts of feature set
• Feature selection approaches

Part 3: Applications of Feature Selection and Software

Page 4:

Part 1: Dimension Reduction

Page 5:

Dimension Reduction

Dimension

Dimension (Feature or Variable): A measurement of a certain aspect of an object

Two features of a person:
• weight
• height


Page 6:

Dimension Reduction

Feature Space

Feature Space: an abstract space where each pattern sample is represented as a point.


Page 7:

Large and high-dimensional data (web documents, etc.): a large amount of resources is needed in
• information retrieval
• classification tasks
• data preservation
• etc.

Dimension Reduction

Introduction

Page 8:

Dimension Reduction

Definition & Goals

Dimensionality reduction: The study of methods for reducing the number of dimensions describing the object

General objectives of dimensionality reduction:

Reduce the computational cost

Improve the quality of data for efficient data-intensive processing tasks


Page 9:

[Figure: scatter plot of samples in Weight (kg) vs. Height (cm) space (height roughly 140-150 cm, weight roughly 50-60 kg), with the two classes separable.]

Dimension reduction preserves as much information about the overweight/underweight classification as possible, makes classification easier, and reduces the data size (2 features → 1 feature).

Dimension Reduction

Definition & Goals

Class 1: overweight

Class 2: underweight

Page 10:

Dimension Reduction

Curse of dimensionality

As the number of dimensions increases, a fixed data sample becomes exponentially sparse.

Example:

Observe that the data become more and more sparse in higher dimensions

An effective solution to the problem of the “curse of dimensionality” is dimensionality reduction.
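A minimal sketch of this effect (an illustration, not from the slides): draw a fixed number of uniform random points in [0, 1]^d and watch nearest-neighbour distances grow with d, i.e. the sample becomes sparser.

```python
import numpy as np

# With a fixed sample size, data become sparser as the dimension d grows:
# the average distance from each point to its nearest neighbour increases.
rng = np.random.default_rng(0)
for d in (1, 2, 5, 10, 50):
    x = rng.random((100, d))                  # 100 points in [0, 1]^d
    diff = x[:, None, :] - x[None, :, :]      # pairwise differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # Euclidean distances
    np.fill_diagonal(dist, np.inf)            # ignore self-distances
    print(f"d={d:3d}  mean nearest-neighbour distance={dist.min(axis=1).mean():.3f}")
```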


Page 11:

Dimension Reduction

Research and Application

Why has dimension reduction recently become a subject of so much research?

Massive data of large dimensionality in:

Knowledge discovery

Text mining

Web mining

and . . .


Page 12:

Dimension Reduction

Grouping of dimension reduction methods

Dimensionality reduction approaches include

Feature Selection

Feature Extraction


Page 13:

Dimension Reduction

Grouping of dimension reduction methods: Feature Selection

Dimensionality reduction approaches include

Feature Selection: the problem of choosing a small subset of features that, ideally, is necessary and sufficient to describe the target concept.

Example:

Feature set = {X, Y}; two classes.

Goal: Classification

Feature X or Feature Y? Answer: Feature X
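To make the X-versus-Y choice concrete, here is a small sketch with hypothetical synthetic data; the Fisher-style separability score is an illustrative criterion, not one named on the slides.

```python
import numpy as np

# Hypothetical two-class data: feature X separates the classes, Y does not.
rng = np.random.default_rng(1)
n = 200
labels = rng.integers(0, 2, n)               # class labels 0/1
feat_x = labels * 3.0 + rng.normal(0, 1, n)  # mean shifts with the class
feat_y = rng.normal(0, 1, n)                 # independent of the class

def fisher_score(f, y):
    """Between-class separation divided by within-class spread."""
    f0, f1 = f[y == 0], f[y == 1]
    return (f0.mean() - f1.mean()) ** 2 / (f0.var() + f1.var())

print("score of X:", fisher_score(feat_x, labels))  # large -> keep X
print("score of Y:", fisher_score(feat_y, labels))  # near zero -> drop Y
```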


Page 14:

Feature Selection (FS) selects a subset of the existing features, e.g.:

Preserves weight

Dimension Reduction

Grouping of dimension reduction methods: Feature Selection

Page 15:

Dimension Reduction

Grouping of dimension reduction methods

Dimensionality reduction approaches include

Feature Extraction: creates new features based on transformations or combinations of the original feature set.

Original features: {X1, X2}

New Feature


Page 16:

Feature Extraction (FE) generates new features, e.g.:

Preserves weight / height

Dimension Reduction

Grouping of dimension reduction methods

Page 17:

Dimension Reduction

Grouping of dimension reduction methods

Dimensionality reduction approaches include

Feature Extraction: creates new features based on transformations or combinations of the original feature set.

N: number of original features; M: number of extracted features; M < N.


Page 18:

Dimension Reduction

Question: Feature Selection or Feature Extraction?

It depends on the problem. Examples:

• Pattern recognition: the problem is to extract a small set of features that recovers most of the variability of the data.

• Text mining: the problem is to select a small subset of words or terms (not new features that are combinations of words or terms).

• Image compression: the problem is to find the best extracted features to describe the image.


Page 19:

Part 2: Feature Selection

Page 20:

Feature selection

Thousands to millions of low-level features: select the most relevant ones to build better, faster, and easier-to-understand learning machines.

[Figure: data matrix X with n samples and N features, reduced to m selected features.]


Page 21:

Feature selection: Parts of feature set

Irrelevant or Relevant

Three disjoint categories of features:

Irrelevant

Weakly Relevant

Strongly Relevant

Page 22:

Feature selection: Parts of feature set

Irrelevant or Relevant

Goal: Classification

Two classes: {Lion, Deer}. We use some features to classify a new instance. To which class does this animal belong?

Page 23:

Feature selection: Parts of feature set

Irrelevant or Relevant

Goal: Classification

Two classes: {Lion, Deer}. We use some features to classify a new instance.

Q: Number of legs? A: 4

So, number of legs is an irrelevant feature. (Feature 1: Number of legs)

Page 24:

Feature selection: Parts of feature set

Irrelevant or Relevant

Goal: Classification

Two classes: {Lion, Deer}. We use some features to classify a new instance.

Q: What is its color? A:

So, color is an irrelevant feature. (Feature 1: Number of legs; Feature 2: Color)

Page 25:

Feature selection: Parts of feature set

Irrelevant or Relevant

Goal: Classification

Two classes: {Lion, Deer}. We use some features to classify a new instance.

Q: What does it eat? A: Grass

So, Feature 3 is a relevant feature. (Feature 1: Number of legs; Feature 2: Color; Feature 3: Type of food)

Page 26:

Feature selection: Parts of feature set

Irrelevant or Relevant

Goal: Classification

Three classes: {Lion, Deer, Leopard}. We use some features to classify a new instance. To which class does this animal belong?

Page 27:

Feature selection: Parts of feature set

Irrelevant or Relevant

Goal: Classification

Three classes: {Lion, Deer, Leopard}. We use some features to classify a new instance.

Q: Number of legs? A: 4

So, number of legs is an irrelevant feature. (Feature 1: Number of legs)

Page 28:

Feature selection: Parts of feature set

Irrelevant or Relevant

Goal: Classification

Three classes: {Lion, Deer, Leopard}. We use some features to classify a new instance.

Q: What is its color? A:

So, color is a relevant feature. (Feature 1: Number of legs; Feature 2: Color)

Page 29:

Feature selection: Parts of feature set

Irrelevant or Relevant

Goal: Classification

Three classes: {Lion, Deer, Leopard}. We use some features to classify a new instance.

Q: What does it eat? A: Meat

So, Feature 3 is a relevant feature. (Feature 1: Number of legs; Feature 2: Color; Feature 3: Type of food)

Page 30:

Feature selection: Parts of feature set

Irrelevant or Relevant

Goal: Classification

Three classes: {Lion, Deer, Leopard}. We use some features to classify a new instance. (Feature 1: Number of legs; Feature 2: Color; Feature 3: Type of food)

Add a new feature: Felidae. It is a weakly relevant feature. Optimal set: {Color, Type of food} or {Color, Felidae}.

Page 31:

Feature selection: Parts of feature set

Irrelevant or Relevant

Traditionally, feature selection research has focused on searching for relevant features.

[Figure: the feature set partitioned into Irrelevant and Relevant features.]

Page 32:

Data set: five Boolean features; C = F1 ∨ F2, F3 = ¬F2, F5 = ¬F4.

Optimal subset: {F1, F2} or {F1, F3}

Feature selection: Parts of feature set

Irrelevant or Relevant: An Example

F1 F2 F3 F4 F5 | C
 0  0  1  0  1 | 0
 0  1  0  0  1 | 1
 1  0  1  0  1 | 1
 1  1  0  0  1 | 1
 0  0  1  1  0 | 0
 0  1  0  1  0 | 1
 1  0  1  1  0 | 1
 1  1  0  1  0 | 1
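The table can be reconstructed and checked programmatically; a minimal Python sketch mirroring the definitions above:

```python
import itertools

# The slide's dataset: C = F1 OR F2, with F3 = NOT F2 and F5 = NOT F4.
rows = [(f1, f2, 1 - f2, f4, 1 - f4, f1 | f2)
        for f4, f1, f2 in itertools.product((0, 1), repeat=3)]
for f1, f2, f3, f4, f5, c in rows:
    print(f1, f2, f3, f4, f5, "|", c)
# C is fully determined by {F1, F2} (equivalently by {F1, F3}),
# while F4 and F5 carry no information about C.
```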

Page 33:

Feature selection: Parts of feature set

Irrelevant or Relevant

Formal Definition 1 (Irrelevance):

Irrelevance indicates that the feature is not necessary at all.

In the previous example: F4 and F5 are irrelevant.

Page 34:

Definition 1 (Irrelevance): A feature Fi is irrelevant if, for every subset S'i ⊆ Si, P(C | Fi, S'i) = P(C | S'i).

Feature selection: Parts of feature set

Irrelevant or Relevant

Irrelevance indicates that the feature is not necessary at all

Notation: F is the full set of features, Fi a feature, and Si = F − {Fi}.

Page 35:

Feature selection: Parts of feature set

Irrelevant or Relevant

Categories of relevant features:

Strongly Relevant

Weakly Relevant

[Figure: the feature set partitioned into Irrelevant, Weakly Relevant, and Strongly Relevant features.]

Page 36:

Data set: five Boolean features; C = F1 ∨ F2, F3 = ¬F2, F5 = ¬F4.

Feature selection: Parts of feature set

Irrelevant or Relevant: An Example

F1 F2 F3 F4 F5 | C
 0  0  1  0  1 | 0
 0  1  0  0  1 | 1
 1  0  1  0  1 | 1
 1  1  0  0  1 | 1
 0  0  1  1  0 | 0
 0  1  0  1  0 | 1
 1  0  1  1  0 | 1
 1  1  0  1  0 | 1

Page 37:

Feature selection: Parts of feature set

Irrelevant or Relevant

Formal Definition 2 (Strong relevance):

Strong relevance of a feature indicates that the feature is always necessary for an optimal subset

It cannot be removed without affecting the original conditional class distribution.

In the previous example: feature F1 is strongly relevant.

Page 38:

Feature selection: Parts of feature set

Irrelevant or Relevant

Definition 2 (Strong relevance): A feature Fi is strongly relevant if P(C | Fi, Si) ≠ P(C | Si).


A strongly relevant feature cannot be removed without affecting the original conditional class distribution.

Page 39:

Feature selection: Parts of feature set

Irrelevant or Relevant

Formal Definition 3 (Weak relevance):

Weak relevance suggests that the feature is not always necessary, but may become necessary for an optimal subset under certain conditions.

In the previous example: F2 and F3 are weakly relevant.

Page 40:

Feature selection: Parts of feature set

Irrelevant or Relevant

Definition 3 (Weak relevance): A feature Fi is weakly relevant if P(C | Fi, Si) = P(C | Si) but there exists a subset S'i ⊂ Si such that P(C | Fi, S'i) ≠ P(C | S'i).


Weak relevance suggests that the feature is not always necessary, but may become necessary for an optimal subset under certain conditions.

Page 41:

Example:

In order to determine the target concept (C = g(F1, F2)):

• F1 is indispensable
• one of F2 and F3 can be disposed of
• both F4 and F5 can be discarded

Optimal subset: either {F1, F2} or {F1, F3}.

The goal of feature selection is to find either of them.

Feature selection: Parts of feature set

Optimal Feature Subset

Page 42:

Feature selection: Parts of feature set

Optimal Feature Subset

Conclusion

An optimal subset should include all strongly relevant features, none of the irrelevant features, and a subset of the weakly relevant features.

Optimal subset: either {F1, F2} or {F1, F3}

Open question: which weakly relevant features should be selected, and which removed?

Page 43:

Feature selection: Parts of feature set

Redundancy

Solution

Defining Feature Redundancy


Page 44:

Redundancy

It is widely accepted that two features are redundant to each other if their values are completely correlated.

Feature selection: Parts of feature set

Redundancy

In the previous example: F2 and F3 are redundant to each other (F3 = ¬F2).

Page 45:

Feature selection: Parts of feature set

Redundancy

Markov blanket

It is used when one feature is correlated with a set of features.

Given a feature Fi, let Mi ⊆ F with Fi ∉ Mi. Mi is said to be a Markov blanket for Fi if P(F − Mi − {Fi}, C | Fi, Mi) = P(F − Mi − {Fi}, C | Mi).

The Markov blanket condition requires that Mi subsume not only the information that Fi has about C, but also about all of the other features.


Page 46:

Feature selection: Parts of feature set

Redundancy

The redundancy definition further divides weakly relevant features into redundant and non-redundant ones.

[Figure: the feature set partitioned into Irrelevant, Weakly Relevant (parts II and III), and Strongly Relevant features.]

II : Weakly relevant and redundant features

III: Weakly relevant but non-redundant features

Optimal subset: strongly relevant features + weakly relevant but non-redundant features

Page 47:

Feature selection Approaches

Page 48:

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Framework of feature selection via subset evaluation


Page 49:

[Flowchart: Original feature set → (1) Generation → subset → (2) Evaluation → goodness of the subset → (3) Stopping criterion: if no, loop back to Generation; if yes → (4) Validation.]

Generates subset of features for evaluation

Can start with:

• no features
• all features
• a random subset of features

Subset Generation

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 50:

Examine all combinations of feature subsets.

Example:

{f1,f2,f3} => { {f1},{f2},{f3},{f1,f2},{f1,f3},{f2,f3},{f1,f2,f3} }

Order of the search space: O(2^d), where d is the number of features.

Optimal subset is achievable.

Too expensive if feature space is large.

Subset search method: Exhaustive Search (example)
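A minimal Python sketch of exhaustive subset search; the scoring function is a hypothetical stand-in for whatever evaluation criterion is plugged in.

```python
from itertools import combinations

def exhaustive_search(features, score):
    """Score every non-empty subset: O(2^d) candidates for d features."""
    best_subset, best_score = None, float("-inf")
    for k in range(1, len(features) + 1):
        for subset in combinations(features, k):
            s = score(subset)
            if s > best_score:
                best_subset, best_score = subset, s
    return best_subset, best_score

# Toy criterion (illustrative): reward subsets containing f1, penalise size.
print(exhaustive_search(["f1", "f2", "f3"],
                        lambda s: ("f1" in s) - 0.1 * len(s)))  # (('f1',), 0.9)
```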

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 51:

[Flowchart as before: (1) Generation → (2) Evaluation → (3) Stopping criterion → (4) Validation.]

Measures the goodness of the subset

Compares with the previous best subset

if found better, then replaces the previous best subset

Subset Evaluation

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 52:

Each feature or feature subset needs to be evaluated for importance by a criterion.

Based on the criterion functions used to search for informative features, existing feature selection algorithms can generally be categorized as:

Filter model

Wrapper model

Embedded methods

Note: Different criteria may select different features.

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Subset Evaluation

Page 53:

Filter

The filter approach utilizes the data alone to decide which features should be kept, without running the learning algorithm.

The filter approach basically pre-selects the features, and then applies the selected feature subset to the clustering algorithm.

Evaluation function ≠ classifier: the effect of the selected subset on the performance of the classifier is ignored.

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 54:

Filter (1): Independent Criterion

Some popular independent criteria are:

Distance measures (Euclidean distance measure).

Information measures (Entropy, Information gain, etc.)

Dependency measures (Correlation coefficient)

Consistency measures
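As an illustration of an information measure used as an independent criterion, a small sketch using scikit-learn's mutual-information estimator (scikit-learn here is an assumption; the slides later demonstrate Weka instead):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

# Filter scoring: rank features by mutual information with the class,
# without ever running a classifier.
rng = np.random.default_rng(2)
y = rng.integers(0, 2, 300)
X = np.column_stack([
    y + rng.normal(0, 0.3, 300),   # strongly related to the class
    rng.normal(0, 1, 300),         # pure noise
])
print(mutual_info_classif(X, y, random_state=0))  # first score >> second
```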

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 55:

Wrappers

In wrapper methods, the performance of a learning algorithm is used to evaluate the goodness of selected feature subsets.

Evaluation function = classifier: the effect on the classifier is taken into account.

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 56:

Wrappers (2)

Wrappers utilize a learning machine as a “black box” to score subsets of features according to their predictive power.
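A minimal wrapper sketch, assuming scikit-learn: cross-validated accuracy of an SVM scores every candidate subset (exhaustive enumeration is feasible here only because the toy dataset has four features).

```python
from itertools import combinations
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
# The classifier itself is the evaluation function for each subset.
best = max(
    (cols for k in range(1, X.shape[1] + 1)
          for cols in combinations(range(X.shape[1]), k)),
    key=lambda cols: cross_val_score(SVC(), X[:, cols], y, cv=5).mean(),
)
print("best subset of feature indices:", best)
```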

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 57:

Filters

Advantages

Fast execution: Filters generally involve a non-iterative computation on the dataset, which can execute much faster than a classifier training session

Generality: Since filters evaluate the intrinsic properties of the data, rather than their interactions with a particular classifier, their results exhibit more generality: the solution will be “good” for a larger family of classifiers

Disadvantages

The main disadvantage of the filter approach is that it totally ignores the effects of the selected feature subset on the performance of the induction algorithm.

Filters vs. Wrappers

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 58:

Filters vs. Wrappers

Wrappers

Advantages

Accuracy: wrappers generally achieve better recognition rates than filters since they are tuned to the specific interactions between the classifier and the dataset

Disadvantages

Slow execution: since the wrapper must train a classifier for each feature subset (or several classifiers if cross-validation is used), the method can become infeasible for computationally intensive methods

Lack of generality: the solution lacks generality since it is tied to the bias of the classifier used in the evaluation function.

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 59:

[Flowchart: start with all features; repeat {train SVM → eliminate useless feature(s)} while there is no performance degradation; once performance degrades, stop.]

Recursive Feature Elimination (RFE) with SVM. Guyon and Weston, 2000. US patent 7,117,188.
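A short sketch of RFE with a linear SVM, using scikit-learn's implementation (the dataset and parameter choices are illustrative, not from the slides):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Recursive Feature Elimination: repeatedly train a linear SVM and drop the
# feature(s) with the smallest weights until the desired number remains.
X, y = load_breast_cancer(return_X_y=True)
rfe = RFE(SVC(kernel="linear"), n_features_to_select=5, step=1).fit(X, y)
print("selected feature indices:", list(rfe.get_support(indices=True)))
```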

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Embedded methods

Page 60:

[Flowchart as before: (1) Generation → (2) Evaluation → (3) Stopping criterion → (4) Validation.]

Based on the generation procedure:

• a pre-defined number of features
• a pre-defined number of iterations

Based on the evaluation function:

• whether addition or deletion of a feature does not produce a better subset
• whether an optimal subset based on some evaluation function has been achieved

Stopping Criterion

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 61:

[Flowchart as before: (1) Generation → (2) Evaluation → (3) Stopping criterion → (4) Validation.]

Basically not part of the feature selection process itself

• compare results with already established results, or with results from competing feature selection methods

Result Validation

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Page 62:

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

A feature subset selected by this approach approximates the optimal subset:

Subset Evaluation: Advantage

[Figure: the feature set partitioned into Irrelevant, Weakly Relevant (parts II and III), and Strongly Relevant features.]

II : Weakly relevant and redundant features

III: Weakly relevant but non-redundant features

Optimal subset: strongly relevant features + weakly relevant but non-redundant features

Page 63:

Feature selection Approaches: Subset Evaluation (Feature Subset Selection)

Subset Evaluation: Disadvantages

The high computational cost of the subset search makes the subset evaluation approach inefficient for high-dimensional data.

Page 64:

Feature selection Approaches

Page 65:

Individual methods (Feature Ranking / Feature Weighting)

Individual methods evaluate each feature individually according to a criterion.

They then select the features that either satisfy a condition or are top-ranked (see the sketch below).

By contrast, exhaustive, greedy, and random searches are subset search methods, because they evaluate whole candidate subsets.
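A minimal ranking sketch, assuming scikit-learn: each feature is scored on its own (here with the ANOVA F-statistic) and the top k are kept.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

# Individual evaluation: one score per feature, linear in the number of
# features, but blind to redundancy between the selected features.
X, y = load_iris(return_X_y=True)
selector = SelectKBest(f_classif, k=2).fit(X, y)
print("scores per feature:", selector.scores_)
print("top-2 feature indices:", selector.get_support(indices=True))
```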

Feature selection Approaches: Individual Evaluation (Feature Weighting/Ranking)

Page 66:

Linear time complexity in terms of the dimensionality N.

The individual method is therefore efficient for high-dimensional data.

Individual Evaluation: Advantage

Feature selection Approaches: Individual Evaluation (Feature Weighting/Ranking)

Page 67:

Individual Evaluation: Disadvantages

Feature selection Approaches: Individual Evaluation (Feature Weighting/Ranking)

It is incapable of removing redundant features.

For high-dimensional data, which may contain a large number of redundant features, this approach may produce results far from optimal.

[Figure: the feature set partitioned into Irrelevant, Weakly Relevant (parts II and III), and Strongly Relevant features.]

Selected = weakly relevant + strongly relevant features

Page 68:

Feature selection Approaches

Page 69:

Feature selection: New Framework

The new framework of feature selection is composed of two steps:

First Step (Relevance analysis): determines the subset of relevant features by removing irrelevant ones.

Second step (Redundancy analysis): determines and eliminates redundant features from the relevant ones, and thus produces the final subset.

New Framework
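A hedged sketch of the two-step framework; the mutual-information relevance test, the pairwise-correlation redundancy test, and both thresholds are illustrative assumptions, much simpler than the Markov-blanket criterion described earlier.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def select_features(X, y, relevance_threshold=0.05, redundancy_threshold=0.95):
    # Step 1 (relevance analysis): drop features with ~zero mutual information.
    relevant = np.flatnonzero(
        mutual_info_classif(X, y, random_state=0) > relevance_threshold)
    # Step 2 (redundancy analysis): drop features highly correlated with a
    # feature that has already been kept.
    kept = []
    for i in relevant:
        if all(abs(np.corrcoef(X[:, i], X[:, j])[0, 1]) < redundancy_threshold
               for j in kept):
            kept.append(i)
    return kept
```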

Page 70:

Part 3: Applications of Feature Selection and Software

Page 71:

Feature selection: Applications of Feature Selection

Page 72:

Internet: an information explosion. About 80% of information is stored in text documents: journals, web pages, emails, etc. It is difficult to extract specific information with current technologies.

Feature selection: Applications of Feature Selection

Text categorization: Importance

Page 73:

Feature selection: Applications of Feature Selection

Text categorization

Assigning documents to a fixed set of categories

News article categorizer:

sports

cultures

health

politics

economics

vacations

Page 74:

Text-Categorization

Documents are represented by a vector whose dimension is the size of the vocabulary, containing word frequency counts.

Vocabulary ≈ 15,000 words (i.e. each document is represented by a 15,000-dimensional vector).

Typical tasks:
• automatic sorting of documents into web directories
• detection of spam email
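A small sketch of feature selection for text, assuming scikit-learn: bag-of-words count vectors, then a chi-squared filter keeps the top-scoring terms.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2

# Tiny spam-detection example: 4 documents, 1 = spam, 0 = not spam.
docs = ["win money now", "meeting at noon", "win a free prize", "lunch at noon"]
labels = [1, 0, 1, 0]
X = CountVectorizer().fit_transform(docs)    # documents x vocabulary counts
X_reduced = SelectKBest(chi2, k=3).fit_transform(X, labels)
print(X_reduced.shape)                       # (4, 3): same documents, 3 terms
```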

Feature selection: Applications of Feature Selection

Text categorization

Page 75:

Feature selection: Applications of Feature Selection

Text categorization

A major characteristic, and difficulty, of text categorization:

High dimensionality of the feature space

Goal: Reduce the original feature space without sacrificing categorization accuracy

Page 76:

Feature selection: Applications of Feature Selection

Image retrieval

Importance: rapid increase in the size and number of image collections from both civilian and military equipment.

Problem: we cannot access or make use of the information unless it is organized.

Content-based image retrieval: instead of being manually annotated by text-based keywords, images would be indexed by their own visual contents (features), such as color, texture, shape, etc.

One of the biggest problems in making content-based image retrieval truly scalable to large image collections is still the “curse of dimensionality”.

Page 77:

Paper: “ReliefF Based Feature Selection in Content-Based Image Retrieval”

A. Sarrafzadeh, Habibollah Agh Atabay, Mir Mosen Pedram, Jamshid Shanbehzadeh

Feature selection: Applications of Feature Selection

Image retrieval

Image dataset: COIL-20, containing 1440 grayscale pictures of 20 classes of objects.

Page 78:

Feature selection: Applications of Feature Selection

Image retrieval

In this paper they use:
• Legendre moments to extract features
• the ReliefF algorithm to select the most relevant and non-redundant features
• a support vector machine to classify the images

The effects of features on classification accuracy

Page 79:

Weka is a piece of software, written in Java, that provides an array of machine learning tools, many of which can be used for data mining

• Pre-processing data
• Feature selection
• Feature extraction
• Regression
• Classification
• Clustering
• Association rules

More functions:
• create random data sets
• connect to data sets in other formats
• visualize data
• …

Feature selection: Weka Software: What can we do with it?

Page 80:

References

[1] M. Dash and H. Liu, "Dimensionality Reduction," in Encyclopedia of Computer Science and Engineering, John Wiley & Sons, vol. 2, pp. 958-966, 2009.

[2] H. Liu and L. Yu, "Toward Integrating Feature Selection Algorithms for Classification and Clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491-502, 2005.

[3] I. Guyon and A. Elisseeff, "An Introduction to Variable and Feature Selection," Journal of Machine Learning Research, vol. 3, pp. 1157-1182, 2003.

[4] L. Yu and H. Liu, "Efficient Feature Selection via Analysis of Relevance and Redundancy," Journal of Machine Learning Research, vol. 5, pp. 1205-1224, 2004.

[5] H. Liu and H. Motoda, Computational Methods of Feature Selection, Chapman and Hall/CRC Press, 2007.

[6] I. Guyon, Lecture 2: Introduction to Feature Selection.

[7] M. Dash and H. Liu, Feature Selection for Classification.

Page 81:

References

[8] Makoto Miwa, A Survey on Incremental Feature Extraction.

[9] Lei Yu, Feature Selection and Its Application in Genomic Data Analysis.

