Kernel adaptive filtering
Lecture slides for EEL6502
Spring 2011
Sohan Seth
The big picture
Adaptive filters are linear.
How do we learn (continuous) nonlinear structures?
A particular approach: assume a parametric model, e.g. a neural network.
Universality: the parametric model should be able to approximate any continuous function.
Universal approximation holds for a sufficiently large model.
Nonlinearly map the signal to a higher-dimensional space and then apply a linear filter.
The overall filter is nonlinear in the input, but linear in the mapped features.
The difficulty: a nonlinear performance surface. Can we learn nonlinear structure using our knowledge of linear adaptive filtering?
Fix the nonlinear mapping and use linear filtering. How do we choose the mapping? We need to guarantee universal approximation!
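The fixed-mapping idea can be sketched in a few lines of NumPy: push the input through a fixed nonlinear feature map and run ordinary LMS on the features. The cubic polynomial map, the sin(2x) target, and the step size below are illustrative assumptions, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x):
    """Fixed nonlinear mapping: cubic polynomial features (an assumed choice)."""
    return np.array([1.0, x, x**2, x**3])

def target(x):
    return np.sin(2 * x)      # unknown nonlinear system to identify (illustrative)

w = np.zeros(4)               # linear filter acting on the mapped signal
eta = 0.1                     # step size (assumed)
errors = []
for _ in range(2000):
    x = rng.uniform(-1, 1)
    e = target(x) - w @ phi(x)    # error of the linear filter in feature space
    w += eta * e * phi(x)         # ordinary LMS update on the features
    errors.append(e**2)

early = float(np.mean(errors[:100]))
late = float(np.mean(errors[-100:]))
```

Because the mapping is fixed, everything known about linear adaptive filtering (stability bounds, convergence analysis) applies unchanged in the feature space.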
A different approach. The filter order equals the dimensionality of the mapped space, which may be very large.
A ‘trick’y solution:
The optimal filter exists in the span of the input data.
Only the inner product matters, not the mapping itself.
e.g. the mapping may even be infinite dimensional.
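The "only the inner product matters" point can be checked numerically. For the homogeneous polynomial kernel (x.y)^2 in two dimensions the feature map can be written out explicitly; the kernel returns the same number without ever forming the map. A small sketch:

```python
import numpy as np

# Explicit feature map for the polynomial kernel (x.y)^2 in 2-D:
# Phi(x) = (x1^2, sqrt(2) x1 x2, x2^2), so <Phi(x), Phi(y)> = (x.y)^2.
def phi(x):
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

def kernel(x, y):
    return float(np.dot(x, y)) ** 2    # no explicit mapping needed

x = np.array([1.0, 2.0])
y = np.array([0.5, -1.0])

lhs = float(phi(x) @ phi(y))    # inner product after explicit mapping
rhs = kernel(x, y)              # same value straight from the kernel
```

For a Gaussian kernel no finite-dimensional map exists at all, yet the kernel value is computed just as cheaply, which is why an infinite-dimensional mapping is not an obstacle.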
Top-down design
Output is a projection
Inner product and pd kernel are equivalent.
An inner product <.,.> satisfies:
1. Symmetry: <x, y> = <y, x>
2. Linearity: <ax + by, z> = a<x, z> + b<y, z>
3. Positive definiteness: <x, x> >= 0, with equality only for x = 0
A positive definite kernel is an inner product in some space.
Hilbert space: a linear space with an inner product.
Use a pd kernel to implicitly construct the nonlinear mapping.
Positive definite (pd) kernel: kappa(x, y) such that every Gram matrix K_ij = kappa(x_i, x_j) is positive semi-definite,
e.g. the Gaussian kernel kappa(x, y) = exp(-||x - y||^2 / (2 sigma^2)), or the polynomial kernel kappa(x, y) = (x . y + 1)^p.
How do things work?
Mercer decomposition: a generalization of the eigenvalue decomposition to function space.
Take a positive definite kernel kappa. Then
kappa(x, y) = sum_i lambda_i phi_i(x) phi_i(y), with lambda_i >= 0,
so the implicit mapping is Phi(x) = (sqrt(lambda_1) phi_1(x), sqrt(lambda_2) phi_2(x), ...).
The number of components can be infinite: infinitely many parameters to learn.
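On a finite sample the Mercer decomposition reduces to the eigendecomposition of the Gram matrix: positive definiteness of the kernel shows up as non-negative eigenvalues, and the matrix is rebuilt exactly from its eigenpairs. A sketch with random data and an assumed Gaussian kernel width:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))                     # arbitrary sample

# Gram matrix of a Gaussian kernel (width 1, an assumed value).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq)

# Finite-sample analogue of the Mercer decomposition: eigendecompose the
# Gram matrix. Positive definiteness means no negative eigenvalues, and
# K equals the sum of lambda_i * v_i v_i^T over all eigenpairs.
lam, V = np.linalg.eigh(K)
K_rec = (V * lam) @ V.T
```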
Bottom-up design
Nonlinearity is implicit in the choice of kernel.
Functional view: we never evaluate the mapping explicitly; it is applied implicitly through the kernel function.
We need to remember all the input data and the coefficients.
Feature space
Universality is guaranteed through the kernel.
Ridge regression: how do we find the weight vector w?
Minimize sum_i (d_i - <w, Phi(x_i)>)^2 + lambda ||w||^2.
Solution: w = (Phi Phi^T + lambda I)^{-1} Phi d, where Phi = [Phi(x_1), ..., Phi(x_N)].
Problem: how do we invert an infinite-dimensional matrix?
Rewrite the solution as w = Phi (K + lambda I)^{-1} d, so only the N x N Gram matrix K must be inverted; the regularization term lambda I makes the inverse well posed.
Online learning
LMS update rule: w(n) = w(n-1) + eta e(n) x(n), where e(n) = d(n) - w(n-1)^T x(n).
LMS update rule in feature space: omega(n) = omega(n-1) + eta e(n) Phi(x(n)).
How do we compute these? Set omega(0) = 0; then omega(n) = eta sum_{i=1}^{n} e(i) Phi(x(i)), and the filter output omega(n-1)^T Phi(x(n)) = eta sum_{i<n} e(i) kappa(x(i), x(n)) needs only kernel evaluations.
Kernel-LMS
Initialize: f_0 = 0.
Iterate for n = 1, 2, ...:
e(n) = d(n) - f_{n-1}(x(n))
f_n = f_{n-1} + eta e(n) kappa(x(n), .)
As in linear LMS, the step size must stay below a bound set by the largest eigenvalue of the autocorrelation matrix, which is unknown in practice.
Issues:
1. Need to choose a kernel
2. Need to select the step size
3. Need to store all inputs and coefficients
4. No regularization
5. O(n) time complexity for each iteration
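The iteration above transcribes directly into code; the kernel width, step size, and target signal below are assumptions. Every input becomes a stored center, which is exactly issues 3 and 5: storage and per-step cost grow linearly with n.

```python
import numpy as np

rng = np.random.default_rng(3)

def kernel(a, b, width=0.5):                  # Gaussian kernel, assumed width
    return np.exp(-np.sum((a - b) ** 2) / (2 * width**2))

eta = 0.2                                     # step size (assumed)
centers, coeffs = [], []                      # must store every input seen

def predict(x):
    # f_{n-1}(x): O(n) kernel evaluations per step (issue 5 above).
    return sum(c * kernel(x, u) for c, u in zip(coeffs, centers))

errors = []
for n in range(500):
    x = rng.uniform(-1, 1, size=1)
    d = np.sin(3 * x[0])                      # desired signal (illustrative)
    e = d - predict(x)
    centers.append(x)                         # f_n = f_{n-1} + eta e(n) kappa(x(n), .)
    coeffs.append(eta * e)
    errors.append(float(e) ** 2)
```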
Functional approximation
The kernel should be universal, e.g. the Gaussian kernel.
How do we choose the kernel parameter?
Implementation details
Large kernel size vs. small kernel size: [comparison figure]
Choosing the best value of the kernel size:
1. Cross validation: accurate but time consuming
2. Rules of thumb: fast but not accurate
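A minimal cross-validation sweep over candidate kernel sizes, using kernel ridge regression as the model; the candidate widths, lambda, and data are illustrative. A far-too-small width over-fits badly, and the CV error selects a moderate width.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(60, 1))
d = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=60)

def gram(A, B, width):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * width**2))

def cv_mse(width, lam=1e-2, folds=5):
    """Cross-validated error of kernel ridge regression for one kernel size."""
    idx = np.arange(len(X))
    errs = []
    for f in range(folds):
        test = idx[f::folds]
        train = np.setdiff1d(idx, test)
        K = gram(X[train], X[train], width)
        alpha = np.linalg.solve(K + lam * np.eye(len(train)), d[train])
        pred = gram(X[test], X[train], width) @ alpha
        errs.append(np.mean((pred - d[test]) ** 2))
    return float(np.mean(errs))

widths = [0.01, 0.1, 0.3, 1.0, 3.0]
scores = {w: cv_mse(w) for w in widths}
best = min(scores, key=scores.get)
```

Rules of thumb (e.g. based on sample spacing) would replace the sweep with a single formula: faster, but with no accuracy guarantee.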
Limiting the network size:
1. Importance estimation: centers that lie close together are redundant.
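One way to act on "close centers are redundant", sketched with an assumed distance test and threshold (the slides do not fix a specific criterion): accept a new center only if it is farther than a threshold from every stored center.

```python
import numpy as np

def should_add(x, centers, delta=0.15):
    # Keep a new center only if no stored center lies within delta of it
    # (delta is an assumed threshold).
    return all(np.linalg.norm(x - c) > delta for c in centers)

rng = np.random.default_rng(5)
centers = []
for _ in range(1000):
    x = rng.uniform(-1, 1, size=1)
    if should_add(x, centers):
        centers.append(x)
# The dictionary saturates: its size depends on delta, not on how many
# samples have been processed.
```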
Self-regularization. Over-fitting: the filter has as many parameters as samples to fit. How do we remove it?
How does KLMS do it?
Ill-posed-ness
Ill-posedness appears because of small singular values in the autocorrelation matrix when taking its inverse. How do we remove it?
Solve a regularized problem instead.
Tikhonov regularization: weight the inverse of the small singular values,
e.g. replace 1/s by s / (s^2 + lambda), which leaves large singular values nearly untouched but damps the small ones.
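The effect of the filter factors s / (s^2 + lambda) can be seen on a small synthetic autocorrelation matrix with one nearly-zero eigenvalue (all numbers here are illustrative): the plain inverse amplifies a tiny perturbation enormously, while the Tikhonov-weighted inverse stays close to the true weights.

```python
import numpy as np

rng = np.random.default_rng(6)
U, _ = np.linalg.qr(rng.normal(size=(5, 5)))    # random orthonormal basis
s = np.array([3.0, 1.5, 0.8, 0.2, 1e-8])        # one nearly-zero singular value
R = (U * s) @ U.T                               # autocorrelation-like matrix

w_true = np.ones(5)
p = R @ w_true + 1e-5 * U[:, 4]                 # cross-correlation + tiny perturbation

# Plain inverse: divides by every singular value, blowing up the 1e-8 mode.
w_plain = (U * (1 / s)) @ U.T @ p

# Tikhonov: weight by s / (s^2 + lam) instead of 1/s.
lam = 1e-4
w_tik = (U * (s / (s**2 + lam))) @ U.T @ p

err_plain = float(np.linalg.norm(w_plain - w_true))
err_tik = float(np.linalg.norm(w_tik - w_true))
```

The price is a small bias: directions with singular values comparable to lambda are also shrunk slightly.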
Self-regularization: well-posed-ness.
How does KLMS do it? KLMS regularizes the expected solution; however, large singular values might also be suppressed.
The step size acts as the regularizer.

More information on the course website!
Username:
Password: