Statistical Learning & Inference

Lecturer: Liqing Zhang

Dept. Computer Science & Engineering, Shanghai Jiao Tong University

23/4/20 Statistical Learning and Inference 2

Books and References

– Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer-Verlag, 2001

– V. Cherkassky & F. Mulier, Learning From Data, Wiley, 1998

– Vladimir N. Vapnik, The Nature of Statistical Learning Theory, 2nd ed., Springer, 2000

– M. Vidyasagar, Learning and generalization: with applications to neural networks, 2nd ed., Springer, 2003

– G. Casella & R. Berger, Statistical Inference, Thomson, 2002

– T. Cover & J. Thomas, Elements of Information Theory, Wiley

http://www-stat.stanford.edu/~hastie/

http://www-stat.stanford.edu/~tibs/

http://www-stat.stanford.edu/people/faculty/friedman.html


Overview of the Course
– Introduction
– Overview of Supervised Learning
– Linear Methods for Regression and Classification
– Basis Expansions and Regularization
– Kernel Methods
– Model Selection and Inference
– Support Vector Machine
– Bayesian Inference
– Unsupervised Learning


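Why Statistical Learning?

We are drowning in information but starved for knowledge. ---- R. Roger

The quiet statisticians have changed our world; not by discovering new facts or technical developments, but by changing the ways that we reason, experiment and form our opinions. ---- I. Hacking

Question: Why are today's computers so inefficient at processing intelligent information?

– Images, video, audio

– Cognition, communication

– Language, speech, text

– Biology, genes, proteins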

Cloud Computing Service Layers

(Figure: layer stack with the annotations "Application Focused" and "Infrastructure Focused")

– Services: Complete business services such as PayPal, OpenID, OAuth, Google Maps, Alexa

– Application: Cloud-based software that eliminates the need for local installation, such as Google Apps, Microsoft Online

– Development: Software development platforms used to build custom cloud-based applications (PaaS & SaaS), such as SalesForce

– Platform: Cloud-based platforms, typically provided using virtualization, such as Amazon EC2, Sun Grid

– Storage: Data storage or cloud-based NAS, such as CTERA, iDisk, CloudNAS

– Hosting: Physical data centers such as those run by IBM, HP, NaviSite, etc.

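ECG acquisition and preliminary diagnosis

Community hospitals

Remote diagnosis, treatment, and monitoring center

Data transmission

Automatic and assisted diagnosis; data sharing; (remote doctor) consultation system

Manual diagnosis and treatment advice; feedback of diagnosis results

Mobilizing idle resources of community hospitals

Transmission of problematic ECGs

Feedback of treatment advice

Individual users need more functions:
• Disease monitoring / cardiopulmonary disease
• Rehabilitation training
• Fitness guidance, etc.

Community hospitals also need more functions:
• ECG, respiration, blood pressure
• Chronic-disease rehabilitation training
• Fitness guidance, etc.

Hospital doctors are also a new kind of remote user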


ML: SARS Risk Prediction

(Figure: prediction of SARS Risk from patient attributes)

Pre-Hospital Attributes: Age, Gender, Blood Pressure, Chest X-Ray

In-Hospital Attributes: Albumin, Blood pO2, White Count, RBC Count


ML: Auto Vehicle Navigation

Steering Direction


Protein Folding


The Scale of Biomedical Data

General Procedure in SL

ML Procedure: Problem Definition, Data Acquisition, Feature Analysis, Model Training, Predictions

EX. Pattern Classification

Objective: To recognize horses in images

Procedure: Feature => Classifier => Cross-Validation
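The Feature => Classifier => Cross-Validation procedure can be sketched roughly as follows. This is only an illustration under stated assumptions: the 2-D feature vectors are synthetic stand-ins for real image features, the nearest-centroid classifier is one simple choice among many, and all function names (`make_data`, `train_centroids`, `predict`, `cross_validate`) are hypothetical.

```python
import random

def make_data(n_per_class, seed=0):
    """Synthetic 2-D feature vectors standing in for image features (assumption)."""
    rng = random.Random(seed)
    data = []
    # label 1 = horse, label 0 = non-horse; class centres are arbitrary choices
    for label, centre in ((1, (2.0, 2.0)), (0, (-2.0, -2.0))):
        for _ in range(n_per_class):
            x = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((x, label))
    rng.shuffle(data)
    return data

def train_centroids(samples):
    """Nearest-centroid classifier: store the mean feature vector of each class."""
    sums, counts = {}, {}
    for (x1, x2), y in samples:
        s = sums.setdefault(y, [0.0, 0.0])
        s[0] += x1
        s[1] += x2
        counts[y] = counts.get(y, 0) + 1
    return {y: (s[0] / counts[y], s[1] / counts[y]) for y, s in sums.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda y: (x[0] - centroids[y][0]) ** 2 + (x[1] - centroids[y][1]) ** 2,
    )

def cross_validate(data, n_folds=5):
    """k-fold cross-validation: each fold is held out once for testing."""
    fold_accuracy = []
    for k in range(n_folds):
        test = data[k::n_folds]
        train = [d for i, d in enumerate(data) if i % n_folds != k]
        model = train_centroids(train)
        correct = sum(predict(model, x) == y for x, y in test)
        fold_accuracy.append(correct / len(test))
    return sum(fold_accuracy) / n_folds

data = make_data(50)
cv_accuracy = cross_validate(data)
print(f"mean cross-validated accuracy: {cv_accuracy:.2f}")
```

In a real image task the feature step would replace `make_data` with a feature extractor applied to labelled images; the cross-validation loop itself would stay the same.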


(Figure: example images => Classifier => Horse / Non Horse)


Function Estimation Model

The Function Estimation Model of learning from examples:

– Generator (G) generates observations x (typically in $R^n$), independently drawn from some fixed distribution F(x)

– Supervisor (S) labels each input x with an output value y according to some fixed distribution F(y|x)

– Learning Machine (LM) "learns" from an i.i.d. k-sample of (x, y)-pairs output from G and S, by choosing a function that best approximates S from a parameterised function class $f(x, \alpha)$, where $\alpha$ is in the parameter set $\Lambda$
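A minimal simulation of the three components may help fix the roles of G, S and LM. Everything concrete here is an assumption made for illustration: F(x) is taken to be uniform on [0, 1], F(y|x) is y = 2x plus Gaussian noise, the function class is f(x, alpha) = alpha * x, and the parameter set is a finite grid named `Lambda`.

```python
import random

rng = random.Random(0)

def generator():
    """G: draw x from a fixed distribution F(x); uniform on [0, 1] is an assumption."""
    return rng.uniform(0.0, 1.0)

def supervisor(x):
    """S: draw y from F(y|x); here y = 2x + Gaussian noise (assumption)."""
    return 2.0 * x + rng.gauss(0.0, 0.1)

def f(x, alpha):
    """Parameterised function class f(x, alpha) = alpha * x (assumption)."""
    return alpha * x

# An i.i.d. k-sample of (x, y)-pairs output from G and S
k = 200
sample = [(x, supervisor(x)) for x in (generator() for _ in range(k))]

# LM: choose the index alpha whose function best fits the sample (squared error)
Lambda = [i / 10 for i in range(41)]  # hypothetical parameter set {0.0, 0.1, ..., 4.0}

def mean_squared_error(alpha):
    return sum((y - f(x, alpha)) ** 2 for x, y in sample) / k

alpha_hat = min(Lambda, key=mean_squared_error)
print("chosen alpha:", alpha_hat)
```

Because each function is identified by its index alpha, choosing a function and choosing a parameter value are the same operation here.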


Function Estimation Model

Key concepts: F(x, y), an i.i.d. k-sample on F, functions $f(x, \alpha)$ and the equivalent representation of each f using its index $\alpha$

(Figure: G sends x to both S and LM; S outputs y; LM outputs its estimate of y)


The Problem of Risk Minimization

The loss functional (L, Q)

– the error of a given function on a given example

$$L : (x, y, f_\alpha) \mapsto L(y, f(x, \alpha)), \qquad Q : (z, \alpha) \mapsto L(z_y, f(z_x, \alpha))$$

The risk functional (R)

– the expected loss of a given function on an example drawn from F(x,y)

– the (usual concept of) generalisation error of a given function

$$R(\alpha) = \int Q(z, \alpha) \, dF(z)$$
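Since the risk is an expectation of the loss over F(x,y), it can be approximated by averaging the loss over many fresh draws. The sketch below does this for one fixed function; the distribution (x uniform on [0, 1], y = 2x plus Gaussian noise), the function class f(x, alpha) = alpha * x and the squared-error loss are all assumptions chosen so that the exact risk is available in closed form for comparison.

```python
import random

def Q(z, alpha):
    """Loss of f(x, alpha) = alpha * x on one example z = (x, y), squared error."""
    x, y = z
    return (y - alpha * x) ** 2

def draw_z(rng):
    """One draw from the assumed F(x, y): x ~ U[0, 1], y = 2x + N(0, 0.1^2)."""
    x = rng.uniform(0.0, 1.0)
    y = 2.0 * x + rng.gauss(0.0, 0.1)
    return (x, y)

def monte_carlo_risk(alpha, n=20000, seed=0):
    """Approximate R(alpha), the expected loss, by an average over n draws."""
    rng = random.Random(seed)
    return sum(Q(draw_z(rng), alpha) for _ in range(n)) / n

alpha = 1.0
risk_mc = monte_carlo_risk(alpha)
# Closed form under the assumptions above: (2 - alpha)^2 * E[x^2] + noise variance
risk_exact = (2.0 - alpha) ** 2 / 3.0 + 0.1 ** 2
print(f"Monte Carlo risk: {risk_mc:.4f}, exact risk: {risk_exact:.4f}")
```

In practice F(x,y) is unknown, so this average over fresh draws is not available to the learner; only the finite training sample is.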


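The Problem of Risk Minimization

Three Main Learning Problems

– Pattern Recognition: $y \in \{0, 1\}$ and $L(y, f(x, \alpha)) = \mathbf{1}[y \neq f(x, \alpha)]$

– Regression Estimation: $y \in \mathbb{R}$ and $L(y, f(x, \alpha)) = (y - f(x, \alpha))^2$

– Density Estimation: $L(p(x, \alpha)) = -\log p(x, \alpha)$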
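The three learning problems differ only in the loss they plug into the risk: 0-1 loss for pattern recognition, squared error for regression estimation, and negative log-density for density estimation. A small sketch of these losses follows; the function names are hypothetical and the arguments are plain numbers rather than a full model.

```python
import math

def zero_one_loss(y, prediction):
    """Pattern recognition: 1 if the predicted label differs from y, else 0."""
    return 0 if y == prediction else 1

def squared_loss(y, prediction):
    """Regression estimation: squared difference between y and the prediction."""
    return (y - prediction) ** 2

def neg_log_loss(density_value):
    """Density estimation: negative log of the model density at the observed x."""
    return -math.log(density_value)

print(zero_one_loss(1, 0), squared_loss(3.0, 1.0), neg_log_loss(0.5))
```

A lower density assigned to an observed point gives a larger negative-log loss, which is why minimising the average of this loss over a sample corresponds to maximum likelihood.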


General Formulation

The Goal of Learning

– Given an i.i.d. k-sample $z_1, \ldots, z_k$ drawn from a fixed distribution F(z)

– For a function class' loss functionals $Q(z, \alpha)$, with $\alpha$ in $\Lambda$

– We wish to minimise the risk, finding a function $\alpha^*$

$$\alpha^* = \arg\min_{\alpha \in \Lambda} R(\alpha)$$


General Formulation

The Empirical Risk Minimization (ERM) Inductive Principle

– Define the empirical risk (sample/training error):

$$R_{emp}(\alpha) = \frac{1}{k} \sum_{i=1}^{k} Q(z_i, \alpha)$$

– Define the empirical risk minimiser:

$$\alpha_k = \arg\min_{\alpha \in \Lambda} R_{emp}(\alpha)$$

– ERM approximates $Q(z, \alpha^*)$ with $Q(z, \alpha_k)$, the $R_{emp}$ minimiser; that is, ERM approximates $\alpha^*$ with $\alpha_k$

– Least-squares and Maximum-likelihood are realisations of ERM
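The ERM principle can be sketched for a one-dimensional pattern recognition problem: compute the empirical risk of every candidate function on the sample and keep the minimiser. The setup is hypothetical: threshold classifiers f(x, alpha) = 1 if x > alpha else 0, a finite grid `Lambda` as parameter set, and synthetic data whose true threshold is 0.6 with 10% of labels flipped.

```python
import random

def f(x, alpha):
    """Threshold classifier: predict 1 when x exceeds alpha (assumed function class)."""
    return 1 if x > alpha else 0

def Q(z, alpha):
    """0-1 loss of f(., alpha) on one example z = (x, y)."""
    x, y = z
    return 0 if f(x, alpha) == y else 1

def empirical_risk(sample, alpha):
    """R_emp(alpha): average loss over the k training examples."""
    return sum(Q(z, alpha) for z in sample) / len(sample)

# Hypothetical data: x uniform on [0, 1], label 1 if x > 0.6, flipped with prob. 0.1
rng = random.Random(1)
sample = []
for _ in range(500):
    x = rng.random()
    y = 1 if x > 0.6 else 0
    if rng.random() < 0.1:
        y = 1 - y
    sample.append((x, y))

# Empirical risk minimiser over a finite parameter set (grid of thresholds)
Lambda = [i / 100 for i in range(101)]
alpha_k = min(Lambda, key=lambda a: empirical_risk(sample, a))
r_emp = empirical_risk(sample, alpha_k)
print(f"alpha_k = {alpha_k:.2f}, empirical risk = {r_emp:.3f}")
```

The minimiser found this way depends on the particular sample; whether its empirical risk is close to its true risk is exactly the consistency question studied by learning theory.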


4 Issues of Learning Theory

1. Theory of consistency of learning processes

• What are (necessary and sufficient) conditions for consistency (convergence of $R_{emp}$ to R) of a learning process based on the ERM Principle?

2. Non-asymptotic theory of the rate of convergence of learning processes

• How fast is the rate of convergence of a learning process?

3. Generalization ability of learning processes

• How can one control the rate of convergence (the generalization ability) of a learning process?

4. Constructing learning algorithms (e.g. the SVM)

• How can one construct algorithms that can control the generalization ability?


Change in Scientific Methodology

TRADITIONAL
– Formulate hypothesis
– Design experiment
– Collect data
– Analyze results
– Review hypothesis
– Repeat/Publish

NEW
– Design large experiments
– Collect large data
– Put data in large database
– Formulate hypothesis
– Evaluate hypothesis on database
– Run limited experiments
– Review hypothesis
– Repeat/Publish


Learning & Adaptation

Any method that incorporates information from training samples in the design of a classifier employs learning.

Due to the complexity of classification problems, we cannot guess the best classification decision ahead of time; we need to learn it.

Creating classifiers then involves positing some general form of model, or form of the classifier, and using examples to learn the complete classifier.


Supervised learning

In supervised learning, a teacher provides a category label for each pattern in a training set. These are then used to train a classifier which can thereafter solve similar classification problems by itself.

– Such as Face Recognition, Text Classification, ……


Unsupervised learning

In unsupervised learning, or clustering, there is no explicit teacher and no labelled training data. The system forms natural clusters of input patterns and classifies them based on the clusters they belong to.

– Data Clustering, Data Quantization, Dimensionality Reduction, ……
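One common way to form clusters without labels is k-means, shown here as a minimal 1-D sketch. The slides do not name a specific clustering algorithm, so k-means is a swapped-in illustration; the data points, the initial centres and the function name `kmeans_1d` are all hypothetical.

```python
def kmeans_1d(points, centres, n_iter=10):
    """Plain k-means on 1-D data: alternate assignment and centre update."""
    clusters = [[] for _ in centres]
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest centre
        clusters = [[] for _ in centres]
        for p in points:
            j = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
            clusters[j].append(p)
        # Update step: each centre moves to the mean of its cluster (kept if empty)
        centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
    return centres, clusters

points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]  # unlabelled input patterns
centres, clusters = kmeans_1d(points, centres=[0.0, 6.0])
print("centres:", centres)
print("clusters:", clusters)
```

No label is ever supplied: the grouping comes only from the distances between the input patterns.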


Reinforcement learning

In reinforcement learning, a teacher only tells the classifier whether it is right when it suggests a category for a pattern. The teacher does not say what the correct category is.

– Agent, Robot, ……


Classification

The task of the classifier component is to use the feature vector provided by the feature extractor to assign the object to a category.

Classification is the main topic of this course.

The abstraction provided by the feature vector representation of the input data enables the development of a largely domain-independent theory of classification.

Essentially the classifier divides the feature space into regions corresponding to different categories.


Classification

The degree of difficulty of the classification problem depends on the variability in the feature values for objects in the same category relative to the feature value variation between the categories.

Variability is natural or is due to noise.

Variability can be described through statistics leading to statistical pattern recognition.


Classification

Question: How to design a classifier that can cope with the variability in feature values? What is the best possible performance?

(Figure: Object Representation in Feature Space. Axes: X1 (perimeter), X2 (area). The decision boundary S(x)=0 separates S(x)>=0, Class A, from S(x)<0, Class B. Noise and Biological Variations Cause Class Spread; Classification error due to class overlap.)
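The decision rule in the feature-space figure (Class A when S(x) >= 0, Class B otherwise) can be written as a short sketch. A linear form for S and the particular weights and bias are assumptions made only for illustration; the slide does not specify the shape of S.

```python
def S(x, w=(1.0, 1.0), b=-10.0):
    """Decision function S(x) = w1*x1 + w2*x2 + b; weights and bias are illustrative."""
    return w[0] * x[0] + w[1] * x[1] + b

def classify(x):
    """Class A on or above the boundary S(x) = 0, Class B below it."""
    return "A" if S(x) >= 0 else "B"

# Each object is a feature vector (X1 = perimeter, X2 = area); values are made up
objects = [(8.0, 6.0), (3.0, 2.0), (5.0, 5.0)]
labels = [classify(x) for x in objects]
print(labels)
```

The boundary S(x) = 0 splits the feature space into two regions; objects whose features are pushed across it by noise or natural variation are the ones that get misclassified.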

Examples

– User interfaces: modelling subjectivity and affect, intelligent agents, transduction (input from camera, microphone, or fish sensor)

– Recovering visual models: face recognition, model-based video, avatars

– Dynamical systems: speech recognition, visual tracking, gesture recognition, virtual instruments

– Probabilistic modeling: image compression, low bandwidth teleconferencing, texture synthesis

– ……



Course Web http://bcmi.sjtu.edu.cn/statLearnig/

Teaching Assistant: Liu Ye Email: [email protected]

Assignment

To write a report on the topic you are working on, including:
– Problem definition
– Model and method
– Key issues to be solved
– Outcome

