Robust Nonparametric Regression by Controlling Sparsity
Gonzalo Mateos and Georgios B. Giannakis

Transcript

Slide 1: Robust Nonparametric Regression by Controlling Sparsity
Gonzalo Mateos and Georgios B. Giannakis
ECE Department, University of Minnesota
Acknowledgments: NSF grants no. CCF, EECS
May 24, 2011

Slide 2: Nonparametric regression
- Estimate an unknown function f from a training data set; given the estimate f_hat, function estimation allows predicting the response at new points.
- If one trusts data more than any parametric model, then go nonparametric: f lives in a (possibly infinite-dimensional) space of smooth functions.
- The problem is ill-posed. Workaround: regularization [Tikhonov '77], [Wahba '90] over an RKHS H with reproducing kernel K(., .) and norm ||.||_H:
    f_hat = arg min_{f in H} sum_{i=1}^N [y_i - f(x_i)]^2 + mu * ||f||_H^2
- Our focus: nonparametric regression robust against outliers, with robustness achieved by controlling sparsity.

Slide 3: Our work in context
- Robust nonparametric regression: Huber's function [Zhu et al '08]; no systematic way to select thresholds.
- Robustness and sparsity in linear (parametric) regression: Huber's M-type estimator as Lasso [Fuchs '99], contamination model; Bayesian frameworks [Jin-Rao '10], [Mitra et al '10], rigid choice of lambda.
- Noteworthy applications: load curve data cleansing [Chen et al '10]; spline-based PSD cartography [Bazerque et al '09].

Slide 4: Variational LTS
- Least-trimmed squares (LTS) regression [Rousseeuw '87] has a variational (V)LTS counterpart:
    f_hat = arg min_{f in H} sum_{i=1}^s r_[i]^2(f)   (VLTS)
  where r_[i]^2(f) is the i-th order statistic among the squared residuals r_1^2(f), ..., r_N^2(f); the N - s largest residuals are discarded.
- Q: How should we go about minimizing (VLTS)? It is nonconvex; do minimizer(s) even exist?
- A: Try all subsamples of size s, solve for each, and pick the best. Simple, but intractable beyond small problems.

Slide 5: Modeling outliers
- Introduce outlier variables o_i, with o_i != 0 if sample i is an outlier and o_i = 0 otherwise, so that y_i = f(x_i) + o_i + e_i.
- Nominal data obey y_i = f(x_i) + e_i; outliers obey something else.
- Remarks: both f and the vector o := [o_1, ..., o_N]' are unknown; if outliers are sporadic, then o is sparse!
- This suggests a natural (but intractable) nonconvex estimator: least squares penalized through the number of nonzero o_i.

Slide 6: VLTS as sparse regression
- Lagrangian form:
    (P0)  min_{f in H, o} sum_{i=1}^N [y_i - f(x_i) - o_i]^2 + mu * ||f||_H^2 + lambda_0 * ||o||_0
- Proposition 1: If {f_hat, o_hat} solves (P0) with lambda_0 chosen such that ||o_hat||_0 = N - s, then f_hat solves (VLTS) too.
- The equivalence formally justifies the regression model y_i = f(x_i) + o_i + e_i and its estimator (P0), and it ties sparse regression with robust estimation.
- The tuning parameter lambda_0 controls sparsity in o, i.e., the number of outliers.

Slide 7: Just relax!
- (P0) is NP-hard, so relax ||o||_0 to ||o||_1:
    (P1)  min_{f in H, o} sum_{i=1}^N [y_i - f(x_i) - o_i]^2 + mu * ||f||_H^2 + lambda * ||o||_1
- (P1) is convex, and thus efficiently solved; the role of the sparsity-controlling parameter lambda is central.
- Q: Does (P1) yield robust estimates f_hat? A: Yes! Huber's M-type estimator is obtained as a special case.

Slide 8: Alternating minimization
- (P1) is jointly convex in f and o, which motivates an alternating minimization (AM) solver; a runnable sketch appears after slide 10 below.
- Remarks: a single Cholesky factorization of the regularized kernel matrix is reused across iterations, and the o-update is a simple soft-thresholding of the residuals.
- AM reveals the intertwining between outlier identification and function estimation with outlier-compensated data y_i - o_i.

Slide 9: Lassoing outliers
- Proposition 2: Minimizers {f_hat, o_hat} of (P1) are fully determined by o_hat, which can be obtained by solving a Lasso problem [Tibshirani '94].
- Alternative to AM: solve that Lasso directly (see the second sketch after slide 10).
- This enables effective methods to select lambda, since Lasso solvers return the entire robustification path (RP); cross-validation (CV) fails in the presence of multiple outliers [Hampel '86].

Slide 10: Robustification paths
- The Lasso path of solutions is piecewise linear, and LARS returns the whole RP [Efron '03] at the same cost as a single LS fit.
- The Lasso is simple in the scalar case, so coordinate descent is fast [Friedman '07]; it exploits warm starts and sparsity. Other solvers: SpaRSA [Wright et al '09], SPAMS [Mairal et al '10].
- Leverage these solvers over a 2-D grid: candidate values of mu and, for each mu, candidate values of lambda.
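Slide 8's solver admits a compact implementation: the f-step is a kernel ridge regression on the outlier-compensated data, and the o-step is a soft-thresholding of the residuals. Below is a minimal sketch in Python under stated assumptions: a Gaussian reproducing kernel, the (P1) objective as reconstructed above, and illustrative function names and bandwidth; none of these choices come from the slides.

```python
import numpy as np

def gaussian_kernel(X, Z, bw=1.0):
    """Gaussian (RBF) kernel matrix; one common RKHS choice (assumption)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

def soft(r, t):
    """Soft-thresholding operator: sign(r) * max(|r| - t, 0)."""
    return np.sign(r) * np.maximum(np.abs(r) - t, 0.0)

def am_robust_fit(X, y, mu, lam, bw=1.0, n_iter=50):
    """Alternating minimization for (P1).

    f-step: kernel ridge regression on the compensated data y - o,
            alpha = (K + mu*I)^{-1} (y - o);
    o-step: o_i = soft(y_i - f_hat(x_i), lam/2)."""
    n = len(y)
    K = gaussian_kernel(X, X, bw)
    # Single Cholesky factorization of (K + mu*I), reused by every iteration
    L = np.linalg.cholesky(K + mu * np.eye(n))
    o = np.zeros(n)
    for _ in range(n_iter):
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - o))  # f-step
        f_hat = K @ alpha
        o = soft(y - f_hat, lam / 2.0)                           # o-step
    return f_hat, o, alpha
```

For large lam the o-step returns all zeros and the routine reduces to plain kernel ridge regression; as lam shrinks, more residuals survive the threshold and are flagged as outliers, which is the sparsity-controlling mechanism the slides emphasize.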
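Proposition 2 suggests bypassing AM altogether: eliminate f in closed form and hand the resulting Lasso in o to any of the path solvers named on slide 10. The reduction below, through the kernel "hat" matrix S = K(K + mu*I)^{-1} and a square root of I - S, is my reconstruction of that step, and scikit-learn's lasso_path stands in for LARS/SpaRSA/SPAMS; treat it as a sketch, not the authors' code.

```python
import numpy as np
from scipy.linalg import sqrtm
from sklearn.linear_model import lasso_path

def robustification_path(K, y, mu, lambdas):
    """Whole robustification path o_hat(lambda) for a fixed mu.

    Assumed reduction: minimizing (P1) over f for fixed o leaves
        min_o (y - o)' (I - S) (y - o) + lambda * ||o||_1,
    with S = K (K + mu*I)^{-1}; with U = (I - S)^{1/2} this is a Lasso
    with design matrix U and response U y."""
    n = len(y)
    S = K @ np.linalg.solve(K + mu * np.eye(n), np.eye(n))
    U = np.real(sqrtm(np.eye(n) - S))  # symmetric PSD square root
    # sklearn's Lasso objective is ||r||^2/(2n) + alpha*||o||_1,
    # so its alpha corresponds to lambda/(2n) here
    alphas, O_path, _ = lasso_path(U, U @ y, alphas=np.asarray(lambdas) / (2 * n))
    return 2 * n * alphas, O_path  # columns of O_path: o_hat, largest lambda first
```

Counting the nonzero entries in each column of O_path is exactly how the "range of lambda yielding N - s outliers" rule of slide 11 can be applied.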
Slide 11: Selecting lambda and mu
- Number of outliers known: from the RP, obtain the range of lambda for which ||o_hat||_0 = N - s; discard the (now known) outliers and use CV to determine mu. This rule relies on the RP and on knowledge about the data model.
- Variance sigma^2 of the nominal noise known: from the RP, for each pair {mu, lambda} on the grid, compute the sample variance of the outlier-compensated residuals; the best {mu, lambda} is the pair whose sample variance is closest to sigma^2.
- Variance of the nominal noise unknown: replace sigma^2 above with a robust estimate, e.g., one based on the median absolute deviation (MAD). A sketch of this selection rule follows the concluding summary below.

Slide 12: Nonconvex regularization
- Nonconvex penalty terms approximate ||o||_0 in (P0) better than ||o||_1 does. Options: SCAD [Fan-Li '01], or the sum-of-logs penalty [Candes et al '08].
- Iterative linearization-minimization of the penalty around the current iterate yields a reweighted version of the Lasso; see the second sketch after the concluding summary.
- Remarks: initialize with the (P1) solution; the refinement reduces bias (cf. the adaptive Lasso [Zou '06]).

Slide 13: Robust thin-plate splines
- Specialize the framework to thin-plate splines [Duchon '77], [Wahba '80].
- The smoothing penalty is only a seminorm in this setting; still, Proposition 2 holds for an appropriate kernel.
- Solution: a radial basis function expansion, augmented with a member of the nullspace of the penalty.
- Given o_hat, the remaining unknowns are found in closed form.

Slide 14: Simulation setup
- True function: a Gaussian mixture; training set T: noisy samples of it.
- Data: nominal samples obey y_i = f(x_i) + e_i with i.i.d. noise e_i (sigma^2 known); outliers are drawn i.i.d.

Slide 15: Robustification paths
(Figure: robustification paths over the grid of {mu, lambda} values, with outlier and inlier coefficient trajectories; paths obtained using SpaRSA [Wright et al '09].)

Slide 16: Results
(Figures: true function, nonrobust predictions, robust predictions, refined predictions.)
- The effectiveness in rejecting outliers is apparent.

Slide 17: Generalization capability
- In all cases, a 100% outlier identification success rate.
- Figures of merit: training error and test error.
- The nonconvex refinement leads to consistently lower test error.

Slide 18: Load curve data cleansing
- Load curve: electric power consumption recorded periodically; reliable data are key to realizing the smart grid vision.
- Example: Uruguay's aggregate power consumption (MW).
- Deviations from nominal models (outliers) arise from faulty meters, communication errors, unscheduled maintenance, strikes, and sporting events.
- B-splines for load curve prediction and denoising [Chen et al '10].

Slide 19: Real data tests
(Figures: nonrobust, robust, and refined predictions on the load curve data.)

Slide 20: Concluding summary
- Robust nonparametric regression: VLTS as l0-(pseudo)norm regularized regression (NP-hard); its convex relaxation yields a variational M-type estimator solvable as a Lasso.
- Controlling sparsity amounts to controlling the number of outliers; the sparsity-controlling role of lambda is central.
- Selection of lambda via the Lasso robustification paths, with different options dictated by the available knowledge about the data model.
- Refinement via nonconvex penalty terms: bias reduction and improved generalization capability.
- Real data tests for load curve cleansing.
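The variance-matching rule on slide 11 is easy to operationalize. The sketch below assumes a matrix of outlier-compensated residuals with one column per lambda on the grid; that data layout, the helper's name, and the 1.4826 Gaussian-consistency constant are my choices rather than the slides'.

```python
import numpy as np

def select_lambda(residuals, lambdas, sigma2=None):
    """Pick the lambda whose sample variance of the outlier-compensated
    residuals is closest to the nominal noise variance (slide 11).

    residuals[:, j] holds y_i - f_hat(x_i) - o_hat_i for lambdas[j],
    ordered from largest to smallest lambda."""
    if sigma2 is None:
        # sigma^2 unknown: robust MAD-based surrogate, rescaled by 1.4826
        # so it estimates the standard deviation under Gaussian noise
        r = residuals[:, 0]
        sigma2 = (1.4826 * np.median(np.abs(r - np.median(r)))) ** 2
    sample_var = residuals.var(axis=0, ddof=1)
    return lambdas[np.argmin(np.abs(sample_var - sigma2))]
```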
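Slide 12's refinement can likewise be sketched. Linearizing the sum-of-logs penalty around the current iterate gives per-coordinate weights 1/(|o_i| + delta), i.e., an iteratively reweighted Lasso; the version below reuses the factor U from the path sketch above, and delta, the iteration count, and the cold start at o = 0 are assumptions (the slides initialize from the (P1) solution).

```python
import numpy as np
from sklearn.linear_model import Lasso

def refine_outliers(U, y, lam, n_iter=5, delta=1e-3):
    """Iteratively reweighted l1 (sum-of-logs surrogate): each pass solves
    a weighted Lasso with w_i = 1/(|o_i| + delta), so coordinates with
    large |o_i| are shrunk less, reducing the bias of plain l1."""
    n = len(y)
    o = np.zeros(n)  # ideally warm-started at the (P1) solution
    for _ in range(n_iter):
        w = 1.0 / (np.abs(o) + delta)
        # weighted l1 via column rescaling: with o_i = t_i / w_i the
        # penalty lam * sum_i w_i*|o_i| becomes a plain lam * ||t||_1
        model = Lasso(alpha=lam / (2 * n), fit_intercept=False)
        model.fit(U / w, U @ y)
        o = model.coef_ / w
    return o
```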

