• Semi-Supervised Learning
• Transfer Learning
• Active Learning
• Summary
Modern Topics in Multivariate Methods for Data Analysis
Semi-Supervised Learning
This is an extension of supervised learning. We have two sets of data: a labeled set and an unlabeled set.
Motivation: labeled data is sometimes hard to obtain.
Figure obtained from X. Zhu. Semi-Supervised Learning Tutorial. ICML 2007
An example from Mars Data Analysis
Digital Elevation Map; Geomorphic Map
Martian landscape
Manually drawn geomorphic map of this landscape
Geomorphic map shows landforms chosen and defined by a domain expert.
Segmentation: Results.
Displayed on an elevation background.
2631 segments homogeneous in slope, curvature and flood.
Classification: Labeling.
A representative subset of objects is labeled as one of the following six classes: Plain, Crater Floor, Convex Crater Walls, Concave Crater Walls, Convex Ridges, Concave Ridges.
Labeled segments.
Figure obtained from X. Zhu. Semi-Supervised Learning Tutorial. ICML 2007
How do we approach semi-supervised learning?
Figure obtained from X. Zhu. Semi-Supervised Learning Tutorial. ICML 2007
A Case with No Unlabeled Data
Figure obtained from X. Zhu. Semi-Supervised Learning Tutorial. ICML 2007
A Case with Unlabeled Data
Figure obtained from X. Zhu. Semi-Supervised Learning Tutorial. ICML 2007
A Case with Unlabeled Data
Figure obtained from X. Zhu. Semi-Supervised Learning Tutorial. ICML 2007
A Case with Unlabeled Data
How can we learn from unlabeled data at all?
The answer lies in the set of assumptions about the unlabeled data distribution.
If the assumptions are right, an advantage can be obtained using unlabeled data.
But a decrease in performance is possible if the assumptions are incorrect.
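One simple way to act on such an assumption is self-training: fit a classifier on the labeled data, label the unlabeled points it is confident about, and refit. The sketch below is only an illustration, not the method behind the tutorial figures. It assumes that each class forms one compact cluster, uses a nearest-centroid classifier, and needs at least two classes; the names `self_train` and `margin` are hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def fit_centroids(X, y):
    """Compute one centroid per class label."""
    return {c: centroid([x for x, lab in zip(X, y) if lab == c]) for c in sorted(set(y))}

def predict_with_margin(x, centroids):
    """Return (label of the nearest centroid, distance gap to the runner-up)."""
    ranked = sorted((math.dist(x, mu), c) for c, mu in centroids.items())
    return ranked[0][1], ranked[1][0] - ranked[0][0]

def self_train(X_lab, y_lab, X_unl, margin=1.0, max_rounds=10):
    """Self-training: move confidently classified unlabeled points into the labeled set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_rounds):
        centroids = fit_centroids(X_lab, y_lab)
        confident, rest = [], []
        for x in pool:
            label, gap = predict_with_margin(x, centroids)
            (confident if gap >= margin else rest).append((x, label))
        if not confident:  # nothing left that the current model trusts
            break
        for x, label in confident:
            X_lab.append(x)
            y_lab.append(label)
        pool = [x for x, _ in rest]
    return fit_centroids(X_lab, y_lab)

# Toy data: one labeled point per class, eight unlabeled points in two clusters.
labeled = [(1.0, 1.0), (4.0, 4.0)]
labels = [0, 1]
unlabeled = [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0),
             (4.0, 5.0), (6.0, 5.0), (5.0, 4.0), (5.0, 6.0)]
final_centroids = self_train(labeled, labels, unlabeled)
```

If the cluster assumption is wrong, the confident pseudo-labels can be wrong too, which is how unlabeled data can lower performance.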
Assumptions in Semi-Supervised Learning
Transfer Learning
• The goal is to transfer knowledge gathered from previous experience.
• Also called Inductive Transfer or Learning to Learn.
• Example: Invariant transformations across tasks.
Motivation for transfer learning
Once a predictive model is built, there are reasons to believe the model will cease to be valid at some point in time.
The difference is that now source and target domains can be completely different.
Figure: Transfer learning. Source domain: DB1 and DB2, each with its own learning system. Target domain: DB new, whose learning system receives knowledge from the source domain.
Transfer Learning
Scenarios:
1. Labeling in a new domain is costly.
Figure: DB1 (labeled), classification of patients G1; DB2 (unlabeled), classification of patients G2.
2. Data is outdated. The model was created with one survey, but a new survey is now available.
Figure: Survey 1 → Learning System; Survey 2 → ?
Functional Transfer: Multitask Learning
Train in Parallel with Combined Architecture
Figure: network with input nodes, internal nodes, and output nodes (Left, Straight, Right).
Figure obtained from Brazdil et al., Metalearning: Applications to Data Mining, Chapter 7, Springer, 2009.
Knowledge of Parameters
Assume prior distribution of parameters
Source domain
Learn parameters and adjust prior distribution
Target domain
Learn parameters using the source prior distribution.
P(y|x) = P(x|y) P(y) / P(x)
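A minimal sketch of this idea, under strong simplifying assumptions: the only parameter is the mean of a Gaussian with known noise variance, so the Bayesian update has a closed form. The function name `update_gaussian_mean` and all numbers are illustrative and do not come from the slides.

```python
def update_gaussian_mean(prior_mean, prior_var, data, noise_var=1.0):
    """Conjugate normal update for an unknown mean with known noise variance.

    Returns the posterior mean and posterior variance of the parameter.
    """
    n = len(data)
    post_precision = 1.0 / prior_var + n / noise_var
    post_mean = (prior_mean / prior_var + sum(data) / noise_var) / post_precision
    return post_mean, 1.0 / post_precision

# Start from a broad (nearly uninformative) prior.
broad_prior = (0.0, 100.0)

# Source domain: plenty of data, so the prior is adjusted substantially.
source_data = [1.9, 2.1, 2.0, 2.0]
source_posterior = update_gaussian_mean(*broad_prior, source_data)

# Target domain: a single observation, learned with the source posterior as prior.
target_data = [3.0]
with_transfer = update_gaussian_mean(*source_posterior, target_data)
without_transfer = update_gaussian_mean(*broad_prior, target_data)
```

With the source prior, the target estimate stays between the source mean and the single target observation and has a smaller variance than the estimate obtained from the broad prior alone.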
Parameter Similarity
Task A Parameter A
Task B Parameter B ~ A
Assume hyper-distribution with low variance.
Assume Parameter Similarity
Knowledge of Parameters
Find coefficients ws using SVMs.
Find coefficients wT using SVMs, initializing the search with ws.
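The warm start can be sketched with a small linear SVM trained by full-batch subgradient descent on the regularized hinge loss. This is an illustrative stand-in for a real SVM solver. The function `train_linear_svm`, its hyperparameters, and the toy data are assumptions; there is no bias term, and labels must be -1 or +1.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, w_init=None, lam=0.01, lr=0.1, epochs=100):
    """Subgradient descent on lam/2 * ||w||^2 + mean hinge loss; labels in {-1, +1}."""
    d = len(X[0])
    w = list(w_init) if w_init is not None else [0.0] * d
    for _ in range(epochs):
        grad = [lam * wi for wi in w]  # gradient of the regularizer
        for x, t in zip(X, y):
            if t * dot(w, x) < 1:  # point violates the margin
                for i in range(d):
                    grad[i] -= t * x[i] / len(X)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Source task: find ws from scratch.
X_source = [(2.0, 1.0), (1.0, 2.0), (-2.0, -1.0), (-1.0, -2.0)]
y_source = [1, 1, -1, -1]
w_s = train_linear_svm(X_source, y_source)

# Target task: few examples, so start the search at ws and take only a few steps.
X_target = [(3.0, 1.0), (-3.0, -1.0)]
y_target = [1, -1]
w_t = train_linear_svm(X_target, y_target, w_init=w_s, epochs=5)
```

Starting at ws only helps if the two tasks really have similar parameters; otherwise starting from zero may be just as good.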
Feature Transfer
Feature Transfer: the source domain and the target domain use a shared representation across tasks.
Minimize Loss-Function( y, f(x))
The minimization is done over multiple tasks (multiple regions on Mars).
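One way to write this objective, assuming T tasks with n_t labeled examples each, a shared feature map φ_θ, and task-specific weights w_t (all of these symbols are notation introduced here, not taken from the slides):

```latex
\min_{\theta,\, w_1,\dots,w_T} \; \sum_{t=1}^{T} \sum_{i=1}^{n_t} L\bigl(y_{ti},\, f_t(x_{ti})\bigr),
\qquad f_t(x) = w_t^{\top} \phi_{\theta}(x)
```

The shared parameters θ are fit on all tasks together, while each w_t is fit only on its own task.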
Instance Transfer Learning
Instance Transfer: samples from the source domain are filtered and added to the target domain, giving the learning system a larger target dataset.
A new algorithm for this is called TrAdaBoost.
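The filtering can be illustrated by the reweighting step of TrAdaBoost, written here as it is usually described: misclassified source samples lose weight, and misclassified target samples gain weight. Treat this as a sketch of one boosting round rather than a full implementation. The function name and argument layout are hypothetical, and the weak learner is assumed to have been trained and evaluated elsewhere.

```python
import math

def tradaboost_reweight(w_src, err_src, w_tgt, err_tgt, n_rounds):
    """One TrAdaBoost-style weight update.

    w_src, w_tgt: current sample weights for source and target data.
    err_src, err_tgt: 0/1 flags, 1 where the current weak learner is wrong.
    n_rounds: total number of boosting rounds planned.
    """
    # Weighted error of the weak learner, measured on the target data only.
    eps = sum(w * e for w, e in zip(w_tgt, err_tgt)) / sum(w_tgt)
    if not 0.0 < eps < 0.5:
        raise ValueError("target error must lie strictly between 0 and 0.5")
    beta_t = eps / (1.0 - eps)
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(len(w_src)) / n_rounds))
    # Source samples that disagree with the target concept are down-weighted.
    new_src = [w * beta ** e for w, e in zip(w_src, err_src)]
    # Misclassified target samples are up-weighted, as in AdaBoost.
    new_tgt = [w * beta_t ** (-e) for w, e in zip(w_tgt, err_tgt)]
    return new_src, new_tgt

new_src, new_tgt = tradaboost_reweight(
    w_src=[1.0, 1.0, 1.0, 1.0], err_src=[1, 0, 0, 0],
    w_tgt=[1.0, 1.0, 1.0, 1.0], err_tgt=[1, 0, 0, 0],
    n_rounds=10,
)
```

Over many rounds, source samples that keep being misclassified end up with very small weights, which acts as the filter.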
Active Learning
Active learning is part of the field of supervised learning.
We have labeled and unlabeled data. The novel idea is that we can choose which examples to label during learning.
It is also called “Query Learning”.
Figure: examples are selected from the unlabeled data to be labeled.
Types of Active Learning:
1. Query Synthesis.
The learner can request an example from anywhere in the instance space. It is only appropriate with small finite domains. Some examples may have no meaning.
2. Stream-Based Selective Sampling
Instances are drawn from the input space according to a distribution, and the learner can decide to discard it or not. For example, one can only choose examples from regions of uncertainty.
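A minimal sketch of the selection rule, assuming a binary classifier that outputs P(y = 1 | x) and a region of uncertainty defined by two thresholds. The thresholds, the one-dimensional logistic model, and the function names are illustrative assumptions.

```python
import math

def sigmoid_model(x):
    """Hypothetical current model: P(y = 1 | x) for a one-dimensional input."""
    return 1.0 / (1.0 + math.exp(-x))

def selective_sampling(stream, predict_proba, low=0.3, high=0.7):
    """Yield instances that fall in the region of uncertainty; discard the rest."""
    for x in stream:
        if low <= predict_proba(x) <= high:
            yield x  # worth asking the oracle for a label

stream = [-3.0, -0.5, 0.2, 2.5, 0.6]
queried = list(selective_sampling(stream, sigmoid_model))
```

Each instance is seen once and the decision is made immediately, so the learner never needs to store the whole stream.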
3. Pool-Based Sampling
Assume a small set of labeled examples and a large set of unlabeled examples. Here we evaluate and rank the whole set of unlabeled examples; we then choose one or more examples.
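The ranking step can be sketched as follows, using distance from 0.5 of the predicted probability as the uncertainty score for a binary classifier. The scoring rule, the toy model, and the function names are assumptions made for illustration.

```python
import math

def sigmoid_model(x):
    """Hypothetical current model: P(y = 1 | x) for a one-dimensional input."""
    return 1.0 / (1.0 + math.exp(-x))

def most_uncertain(pool, predict_proba, k=1):
    """Rank the whole unlabeled pool and return the k most uncertain examples."""
    ranked = sorted(pool, key=lambda x: abs(predict_proba(x) - 0.5))
    return ranked[:k]

pool = [-3.0, -0.5, 0.2, 2.5, 0.6]
to_label = most_uncertain(pool, sigmoid_model, k=2)
```

Unlike the stream-based setting, every example in the pool is scored before any query is made, which costs more computation but lets the learner pick the best candidates overall.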
Sampling Based on Uncertainty
Figure taken from “Active Learning” by Burr Settles, Morgan & Claypool, 2012.
Panel labels: 70% accuracy; 90% accuracy.
Summary
• Few labeled examples, labeling is expensive, many unlabeled examples: Semi-Supervised Learning
• Similar classification tasks, but there is indication that the distributions have changed: Transfer Learning
• Few training examples, labeling is expensive: Active Learning