Date post: | 03-Jan-2016 |
Upload: | pearl-arnold |
Lecturer: Liqing Zhang
Dept. Computer Science & Engineering, Shanghai Jiao Tong University
Statistical Learning & Inference
23/4/20 Statistical Learning and Inference 2
Books and References
– Trevor Hastie, Robert Tibshirani & Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer-Verlag, 2001
– V. Cherkassky & F. Mulier, Learning From Data, Wiley, 1998
– Vladimir N. Vapnik, The Nature of Statistical Learning Theory, 2nd ed., Springer, 2000
– M. Vidyasagar, Learning and generalization: with applications to neural networks, 2nd ed., Springer, 2003
– G. Casella & R. Berger, Statistical Inference, Thomson, 2002
– T. Cover & J. Thomas, Elements of Information Theory, Wiley
Overview of the Course
– Introduction
– Overview of Supervised Learning
– Linear Methods for Regression and Classification
– Basis Expansions and Regularization
– Kernel Methods
– Model Selection and Inference
– Support Vector Machines
– Bayesian Inference
– Unsupervised Learning
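Why Statistical Learning?
We are drowning in information, but starved for knowledge. ---- R. Roger
Quiet statisticians have changed our world; not by discovering new facts or developing new technology, but by changing the way we reason, experiment, and form our opinions. ---- I. Hacking
Question: why are today's computers so inefficient at processing intelligent information?
– Images, video, audio
– Cognition, communication
– Language, speech, text
– Biology, genes, proteins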
Cloud Computing
Cloud Computing Service Layers
Layers, from application focused (top) to infrastructure focused (bottom):
– Services: Complete business services such as PayPal, OpenID, OAuth, Google Maps, Alexa
– Application: Cloud-based software that eliminates the need for local installation, such as Google Apps, Microsoft Online
– Development: Software development platforms used to build custom cloud-based applications (PaaS & SaaS), such as SalesForce
– Platform: Cloud-based platforms, typically provided using virtualization, such as Amazon EC2, Sun Grid
– Storage: Data storage or cloud-based NAS, such as CTERA, iDisk, CloudNAS
– Hosting: Physical data centers such as those run by IBM, HP, NaviSite, etc.
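[Figure: remote ECG diagnosis and monitoring system linking community hospitals to a remote diagnosis, treatment, and monitoring center]
Diagram labels:
– Community hospitals: ECG acquisition and preliminary diagnosis
– Data transmission
– Remote diagnosis, treatment, and monitoring center
– Automatic and assisted diagnosis; data sharing; (remote doctor) consultation system
– Manual diagnosis and treatment advice; feedback of diagnostic results
– Mobilizing idle resources of community hospitals
– Abnormal ECGs sent out; treatment advice fed back
Individual users need more functions:
• Disease monitoring / cardiopulmonary disease
• Rehabilitation training
• Fitness guidance, etc.
Community hospitals also need more functions:
• ECG, respiration, blood pressure
• Rehabilitation training for chronic diseases
• Fitness guidance, etc.
Hospital doctors are also a new kind of remote user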
ML: SARS Risk Prediction
[Figure: SARS risk predicted from pre-hospital attributes (age, gender, blood pressure, chest X-ray) and in-hospital attributes (albumin, blood pO2, white count, RBC count)]
ML: Auto Vehicle Navigation
Steering Direction
Protein Folding
The Scale of Biomedical Data
General Procedure in SL
[Figure: ML procedure with the stages Problem Definition, Data Acquisition, Feature Analysis, Model Training, Predictions]
Ex. Pattern Classification
Objective: To recognize horses in images
Procedure: Feature => Classifier => Cross-Validation
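The Feature => Classifier => Cross-Validation procedure can be sketched roughly as follows. This is only an illustration under stated assumptions: the "images" are synthetic lists of pixel intensities, the two features (mean and spread), the nearest-centroid classifier, and all function names are hypothetical choices, not the method used in the lecture.

```python
import random

def extract_features(image):
    """Feature step: map a flat list of pixel intensities to (mean, spread)."""
    mean = sum(image) / len(image)
    spread = max(image) - min(image)
    return (mean, spread)

def train_centroids(features, labels):
    """Classifier step: store the mean feature vector of each class."""
    sums, counts = {}, {}
    for feat, y in zip(features, labels):
        s = sums.setdefault(y, [0.0] * len(feat))
        for j, v in enumerate(feat):
            s[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(v / counts[y] for v in s) for y, s in sums.items()}

def predict(centroids, feat):
    """Assign feat to the class with the closest centroid (squared distance)."""
    def dist(y):
        return sum((a - b) ** 2 for a, b in zip(feat, centroids[y]))
    return min(centroids, key=dist)

def cross_validate(features, labels, n_folds=5):
    """Cross-validation step: average held-out accuracy over interleaved folds."""
    accuracies = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(features)) if i % n_folds == fold]
        train_idx = [i for i in range(len(features)) if i % n_folds != fold]
        model = train_centroids([features[i] for i in train_idx],
                                [labels[i] for i in train_idx])
        hits = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / n_folds

# Synthetic data (assumption): "horse" images are brighter on average.
rng = random.Random(0)
images, labels = [], []
for i in range(100):
    is_horse = i % 2 == 1
    base = 0.7 if is_horse else 0.3
    images.append([min(1.0, max(0.0, rng.gauss(base, 0.1))) for _ in range(64)])
    labels.append("horse" if is_horse else "non-horse")

features = [extract_features(img) for img in images]
cv_accuracy = cross_validate(features, labels)
print(f"5-fold cross-validated accuracy: {cv_accuracy:.2f}")
```

Real images would need a far richer feature extractor; the point here is only that the classifier is always scored on examples it was not trained on.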
[Figure: a classifier sorts example images into Horse and Non Horse]
Function Estimation Model
The function estimation model of learning from examples:
– Generator (G) generates observations $x$ (typically in $\mathbb{R}^n$), independently drawn from some fixed distribution $F(x)$
– Supervisor (S) labels each input $x$ with an output value $y$ according to some fixed distribution $F(y|x)$
– Learning Machine (LM) "learns" from an i.i.d. k-sample of $(x, y)$-pairs output from G and S, by choosing a function that best approximates S from a parameterised function class $f(x, \alpha)$, where $\alpha$ is in the parameter set $\Lambda$
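Function Estimation Model
Key concepts: $F(x, y)$, an i.i.d. k-sample on $F$, functions $f(x, \alpha)$ and the equivalent representation of each $f$ using its index $\alpha$
[Figure: G emits x; S receives x and outputs y; LM receives x and outputs its own estimate of y]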
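A minimal simulation of the three components may help fix the roles. Everything concrete here is an assumption made for illustration: G draws x uniformly on [0, 1], S thresholds x at 0.5 with 10% label noise, the function class is a family of threshold classifiers indexed by alpha, and the parameter set is a grid.

```python
import random

rng = random.Random(1)

def generator():
    """G: draw x independently from a fixed F(x); Uniform(0, 1) is an assumption."""
    return rng.random()

def supervisor(x):
    """S: draw y from a fixed F(y|x); a threshold at 0.5 with 10% label noise."""
    clean = 1 if x > 0.5 else 0
    return clean if rng.random() > 0.1 else 1 - clean

def f(x, alpha):
    """One member of the parameterised class: a threshold classifier with index alpha."""
    return 1 if x > alpha else 0

def learning_machine(sample, parameter_set):
    """LM: choose the index whose function disagrees least with S on the sample."""
    def errors(alpha):
        return sum(f(x, alpha) != y for x, y in sample)
    return min(parameter_set, key=errors)

# Build an i.i.d. k-sample of (x, y) pairs from G and S.
k = 500
sample = []
for _ in range(k):
    x = generator()
    sample.append((x, supervisor(x)))

parameter_set = [i / 100 for i in range(101)]  # a finite grid standing in for the parameter set
alpha_hat = learning_machine(sample, parameter_set)
print("index chosen by the learning machine:", alpha_hat)
```

The learning machine never sees how S works; it only sees the sample, and picking a function is the same as picking its index.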
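The Problem of Risk Minimization
The loss functional (L, Q)
– the error of a given function on a given example
$$L : (x, y, f_\alpha) \mapsto L\big(y, f(x, \alpha)\big), \qquad Q : (z, \alpha) \mapsto L\big(z_y, f(z_x, \alpha)\big)$$
where $z = (x, y)$, with components $z_x = x$ and $z_y = y$
The risk functional (R)
– the expected loss of a given function on an example drawn from $F(x, y)$
– the (usual concept of) generalisation error of a given function
$$R(\alpha) = \int Q(z, \alpha)\, dF(z)$$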
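Since the risk is an expectation over F(z), it can be approximated by averaging the loss over many fresh draws when F is known. The sketch below assumes a toy setting chosen so the exact risk has a closed form: x is uniform on [0, 1], y = x plus Gaussian noise with standard deviation 0.1, the function class is f(x, alpha) = alpha * x, and the loss is squared error. None of these choices come from the slides.

```python
import random

def f(x, alpha):
    """A one-parameter function class (assumption): a line through the origin."""
    return alpha * x

def loss(y, prediction):
    """L: error of a prediction on one example; squared error is an assumption."""
    return (y - prediction) ** 2

def Q(z, alpha):
    """Q(z, alpha) = L(z_y, f(z_x, alpha)) for an example z = (x, y)."""
    x, y = z
    return loss(y, f(x, alpha))

def draw_z(rng):
    """One draw from the assumed F(z): x ~ Uniform(0, 1), y = x + N(0, 0.1^2)."""
    x = rng.random()
    return (x, x + rng.gauss(0.0, 0.1))

def monte_carlo_risk(alpha, n, seed=0):
    """Approximate R(alpha), the integral of Q(z, alpha) dF(z), by a sample mean."""
    rng = random.Random(seed)
    return sum(Q(draw_z(rng), alpha) for _ in range(n)) / n

def exact_risk(alpha):
    """E[((1 - alpha) x + eps)^2] = (1 - alpha)^2 E[x^2] + Var(eps) = (1 - alpha)^2 / 3 + 0.01."""
    return (1 - alpha) ** 2 / 3 + 0.1 ** 2

for alpha in (0.5, 1.0, 1.5):
    print(alpha, round(monte_carlo_risk(alpha, 50_000), 4), round(exact_risk(alpha), 4))
```

In a real learning problem F(z) is unknown, so R cannot be computed this way; that is what motivates replacing it with a quantity computed from the sample.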
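The Problem of Risk Minimization
Three Main Learning Problems
– Pattern Recognition: $y \in \{0, 1\}$ and $L\big(y, f(x, \alpha)\big) = \mathbf{1}\big[y \neq f(x, \alpha)\big]$
– Regression Estimation: $y \in \mathbb{R}$ and $L\big(y, f(x, \alpha)\big) = \big(y - f(x, \alpha)\big)^2$
– Density Estimation: $y \in [0, 1]$ and $L\big(p(x, \alpha)\big) = -\log p(x, \alpha)$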
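The three losses are short enough to write out directly. This sketch uses hypothetical function names, and the Gaussian density with unit variance is an assumed example of a density family p(x, alpha), not one given in the slides.

```python
import math

def zero_one_loss(y, prediction):
    """Pattern recognition: 1 if the predicted label is wrong, 0 otherwise."""
    return 0 if y == prediction else 1

def squared_loss(y, prediction):
    """Regression estimation: squared difference between target and prediction."""
    return (y - prediction) ** 2

def neg_log_loss(density_value):
    """Density estimation: negative log of the model density at the observed x."""
    return -math.log(density_value)

def gaussian_density(x, alpha):
    """Assumed density family p(x, alpha): normal with mean alpha and unit variance."""
    return math.exp(-0.5 * (x - alpha) ** 2) / math.sqrt(2 * math.pi)

print(zero_one_loss(1, 0))
print(squared_loss(3.0, 1.0))
print(neg_log_loss(gaussian_density(0.2, 0.0)))
```

Averaging the negative log loss over a sample gives the negative log-likelihood per example, which is why maximum likelihood fits the same risk-minimization template as the other two problems.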
General Formulation
The Goal of Learning
– Given an i.i.d. k-sample $z_1, \ldots, z_k$ drawn from a fixed distribution $F(z)$
– For a function class' loss functionals $Q(z, \alpha)$, with $\alpha$ in $\Lambda$
– We wish to minimise the risk, finding a function $\alpha^*$
$$\alpha^* = \arg\min_{\alpha \in \Lambda} R(\alpha)$$
General Formulation
The Empirical Risk Minimization (ERM) Inductive Principle
– Define the empirical risk (sample/training error):
$$R_{\mathrm{emp}}(\alpha) = \frac{1}{k} \sum_{i=1}^{k} Q(z_i, \alpha)$$
– Define the empirical risk minimiser:
$$\alpha_k = \arg\min_{\alpha \in \Lambda} R_{\mathrm{emp}}(\alpha)$$
– ERM approximates $Q(z, \alpha^*)$ with $Q(z, \alpha_k)$, the $R_{\mathrm{emp}}$ minimiser; that is, ERM approximates $\alpha^*$ with $\alpha_k$
– Least-squares and Maximum-likelihood are realisations of ERM
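To make the link between ERM and least squares concrete, the sketch below minimises the empirical risk under squared loss in two ways: by brute-force search over a grid of indices, and by the closed-form least-squares solution. The data model (y = 2x plus noise), the one-parameter class f(x, alpha) = alpha * x, and the function names are assumptions chosen for illustration.

```python
import random

def f(x, alpha):
    """Assumed function class: lines through the origin with slope alpha."""
    return alpha * x

def Q(z, alpha):
    """Squared loss of f(., alpha) on one example z = (x, y)."""
    x, y = z
    return (y - f(x, alpha)) ** 2

def empirical_risk(sample, alpha):
    """R_emp(alpha): average of Q(z_i, alpha) over the k training examples."""
    return sum(Q(z, alpha) for z in sample) / len(sample)

def erm_grid(sample, parameter_set):
    """ERM by exhaustive search over a finite parameter set."""
    return min(parameter_set, key=lambda alpha: empirical_risk(sample, alpha))

def erm_least_squares(sample):
    """Closed-form minimiser of R_emp for this class: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in sample) / sum(x * x for x, _ in sample)

# Synthetic k-sample (assumption): y = 2x + Gaussian noise.
rng = random.Random(0)
sample = []
for _ in range(200):
    x = rng.random()
    sample.append((x, 2.0 * x + rng.gauss(0.0, 0.1)))

grid = [i / 100 for i in range(401)]  # candidate indices 0.00 .. 4.00
alpha_grid = erm_grid(sample, grid)
alpha_ls = erm_least_squares(sample)
print("grid ERM:", alpha_grid, "least squares:", round(alpha_ls, 4))
```

The two answers should agree up to the grid resolution, because least squares is exactly the empirical risk minimiser for squared loss over this class.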
4 Issues of Learning Theory
1. Theory of consistency of learning processes
• What are the (necessary and sufficient) conditions for consistency (convergence of $R_{\mathrm{emp}}$ to $R$) of a learning process based on the ERM Principle?
2. Non-asymptotic theory of the rate of convergence of learning processes
• How fast is the rate of convergence of a learning process?
3. Generalization ability of learning processes
• How can one control the rate of convergence (the generalization ability) of a learning process?
4. Constructing learning algorithms (e.g. the SVM)
• How can one construct algorithms that can control the generalization ability?
Change in Scientific Methodology
TRADITIONAL
– Formulate hypothesis
– Design experiment
– Collect data
– Analyze results
– Review hypothesis
– Repeat/Publish
NEW
– Design large experiments
– Collect large data
– Put data in large database
– Formulate hypothesis
– Evaluate hypothesis on database
– Run limited experiments
– Review hypothesis
– Repeat/Publish
Learning & Adaptation
Any method that incorporates information from training samples in the design of a classifier employs learning.
Because classification problems are complex, we cannot guess the best classification decision ahead of time; we need to learn it.
Creating classifiers then involves positing some general form of model, or form of the classifier, and using examples to learn the complete classifier.
Supervised learning
In supervised learning, a teacher provides a category label for each pattern in a training set. These are then used to train a classifier which can thereafter solve similar classification problems by itself.
– Such as Face Recognition, Text Classification, ……
Unsupervised learning
In unsupervised learning, or clustering, there is no explicit teacher and no labeled training data. The system forms natural clusters of input patterns and classifies them based on the clusters they belong to.
– Data Clustering, Data Quantization, Dimensionality Reduction, ……
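As one concrete instance of forming clusters without labels, here is a small k-means sketch. The slides do not name a clustering algorithm, so k-means is a swapped-in example; the synthetic two-blob data and the choice of the first and last points as initial centers are assumptions made to keep the run deterministic.

```python
import random

def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    def dist(j):
        return sum((a - b) ** 2 for a, b in zip(point, centers[j]))
    return min(range(len(centers)), key=dist)

def kmeans(points, centers, n_iter=20):
    """Alternate between assigning points to centers and recomputing the centers."""
    centers = list(centers)
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers

# Unlabeled synthetic data (assumption): two blobs around (0, 0) and (5, 5).
rng = random.Random(0)
points = [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(50)]
points += [(rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)) for _ in range(50)]

centers = kmeans(points, [points[0], points[-1]])
cluster_ids = [nearest(p, centers) for p in points]
print("centers:", [tuple(round(c, 2) for c in center) for center in centers])
```

No label is ever passed in; the cluster index of each point comes only from the geometry of the inputs.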
Reinforcement learning
In reinforcement learning, a teacher only tells the classifier whether it is right when it suggests a category for a pattern. The teacher does not say what the correct category is.
– Agent, Robot, ……
Classification
The task of the classifier component is to use the feature vector provided by the feature extractor to assign the object to a category.
Classification is the main topic of this course.
The abstraction provided by the feature vector representation of the input data enables the development of a largely domain-independent theory of classification.
Essentially the classifier divides the feature space into regions corresponding to different categories.
Classification
The degree of difficulty of the classification problem depends on the variability in the feature values for objects in the same category relative to the feature value variation between the categories.
Variability is natural or is due to noise.
Variability can be described through statistics leading to statistical pattern recognition.
Classification
Question: How to design a classifier that can cope with the variability in feature values? What is the best possible performance?
[Figure: Object Representation in Feature Space. Axes: X1 (perimeter) and X2 (area). The decision boundary S(x) = 0 separates Class A (S(x) >= 0) from Class B (S(x) < 0). Noise and biological variations cause class spread; classification error is due to class overlap.]
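The decision rule in the figure, Class A when S(x) >= 0 and Class B otherwise, can be sketched with a linear S. The slides do not say how S is obtained, so the construction below (a line perpendicular to the segment joining the two class means, passing through its midpoint) is an assumption, as are the synthetic perimeter and area values.

```python
import random

def mean_vector(points):
    """Componentwise mean of a list of 2-D feature vectors."""
    n = len(points)
    return tuple(sum(p[j] for p in points) / n for j in range(2))

def fit_discriminant(class_a, class_b):
    """Return (w, b) so that S(x) = w . x + b is zero midway between the class means."""
    mu_a, mu_b = mean_vector(class_a), mean_vector(class_b)
    w = (mu_a[0] - mu_b[0], mu_a[1] - mu_b[1])
    mid = ((mu_a[0] + mu_b[0]) / 2, (mu_a[1] + mu_b[1]) / 2)
    b = -(w[0] * mid[0] + w[1] * mid[1])
    return w, b

def S(x, w, b):
    """Linear discriminant; S(x) = 0 is the decision boundary."""
    return w[0] * x[0] + w[1] * x[1] + b

def classify(x, w, b):
    """Class A if S(x) >= 0, Class B if S(x) < 0."""
    return "A" if S(x, w, b) >= 0 else "B"

# Synthetic objects (assumption): x = (perimeter, area), with overlapping class spread.
rng = random.Random(0)
class_a = [(rng.gauss(6.0, 1.0), rng.gauss(4.0, 1.0)) for _ in range(200)]
class_b = [(rng.gauss(3.0, 1.0), rng.gauss(2.0, 1.0)) for _ in range(200)]

w, b = fit_discriminant(class_a, class_b)
errors = sum(classify(x, w, b) != "A" for x in class_a)
errors += sum(classify(x, w, b) != "B" for x in class_b)
error_rate = errors / (len(class_a) + len(class_b))
print(f"error rate on the generated objects: {error_rate:.3f}")
```

Because the two classes overlap, some objects will usually land on the wrong side of any boundary; shrinking the spread or moving the class means apart lowers the error rate.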
Examples
– User interfaces: modelling subjectivity and affect, intelligent agents, transduction (input from camera, microphone, or fish sensor)
– Recovering visual models: face recognition, model-based video, avatars
– Dynamical systems: speech recognition, visual tracking, gesture recognition, virtual instruments
– Probabilistic modeling: image compression, low bandwidth teleconferencing, texture synthesis
– ……
Course Web http://bcmi.sjtu.edu.cn/statLearnig/
Teaching Assistant: Liu Ye Email: [email protected]
Assignment
Write a report on the topic you are working on, including:
– Problem definition
– Model and method
– Key issues to be solved
– Outcome