Introduction · Face Recognition · Fast ℓ1-Minimization Algorithms · Distributed Object Recognition · Conclusion
Distributed Sensing and Perception via Sparse Representation
Allen Y. Yang ([email protected])
CIS Seminar, Johns Hopkins, 2010
http://www.eecs.berkeley.edu/~yang Distributed Sensing and Perception via Sparse Representation
Introduction Face Recognition Fast `1-Minimization Algorithms Distributed Object Recognition Conclusion
Distributed Sensing and Perception: A Comparison
Centralized Perception
Up: powerful processors, unlimited memory, unlimited bandwidth
Down: single modality
Distributed Perception
Down: mobile processors, limited onboard memory, band-limited communications
Up: distributed, multi-modality
Can we design an intelligent system over a network that performs better than the sum of its parts?
Challenges
1 Making real-time decisions on portable mobile devices is difficult.
2 Applications demand extremely high accuracy: 99% Precision, 99% Recall?
3 Scenarios demand the ability to reconstruct 3-D environments.
Smart Camera Platform: CITRIC v1
CITRIC platform: available library functions
1 Full support for the Intel IPP library and OpenCV.
2 JPEG compression: 10 fps.
3 Edge detector: 3 fps.
4 Background Subtraction: 5 fps.
5 SIFT detector: 10 sec per frame.
Academic users:
Reference:
AY, et al. “CITRIC: A low-bandwidth wireless camera network platform.” (submitted) ACM Trans. Sensor Networks, 2010.
Body Sensor Platform: DexterNet
1 Body Sensor Layer (BSL)
2 Personal Network Layer (PNL)
3 Global Network Layer (GNL)
Reference:
AY, et al. “DexterNet: An open platform for heterogeneous body sensor networks and its applications.” Body Sensor Networks, 2009.
Outline
1 Robust face recognition with low-resolution, distorted, and disguised images
2 Fast ℓ1-Minimization Algorithms
x∗ = arg min_x ‖x‖1 subj. to b = Ax.
Augmented Lagrange Multiplier
3 Distributed object recognition using a camera network
Robust Face Recognition
Classification of Mixture Subspace Model
1 Face-subspace model [Belhumeur et al. ’97, Basri & Jacobs ’03]
Assume b belongs to Class i in K classes.
b = α_{i,1} v_{i,1} + α_{i,2} v_{i,2} + ··· + α_{i,n_i} v_{i,n_i} = A_i α_i.
2 However, the class label i is the unknown we need to solve for:
Sparse representation: b = [A_1, A_2, ··· , A_K] [α_1; α_2; ··· ; α_K] = Ax.
3 x∗ = [0 ··· 0 α_i^T 0 ··· 0]^T ∈ R^n.
Sparse representation x∗ encodes membership!
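As a sketch of how the sparse coefficients yield the class label (assign b to the class with the smallest reconstruction residual), the rule can be coded in a few lines. The LP solver (SciPy, via the split x = u − v) and the tiny synthetic class dictionaries below are illustrative assumptions, not the pipeline from the talk.

```python
# Minimal sketch of sparse-representation-based classification (SRC)
# on synthetic data; solver and problem sizes are illustrative only.
import numpy as np
from scipy.optimize import linprog

def l1_min(A, b):
    """Solve min ||x||_1 s.t. Ax = b via the LP split x = u - v, u, v >= 0."""
    d, n = A.shape
    c = np.ones(2 * n)
    res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=b,
                  bounds=[(0, None)] * (2 * n))
    return res.x[:n] - res.x[n:]

def src_classify(A_blocks, b):
    """Pick the class whose coefficient block best reconstructs b."""
    x = l1_min(np.hstack(A_blocks), b)
    residuals, start = [], 0
    for Ai in A_blocks:
        ni = Ai.shape[1]
        residuals.append(np.linalg.norm(b - Ai @ x[start:start + ni]))
        start += ni
    return int(np.argmin(residuals))

rng = np.random.default_rng(0)
A_blocks = [rng.standard_normal((20, 5)) for _ in range(3)]  # K = 3 classes
alpha = rng.standard_normal(5)
b = A_blocks[1] @ alpha           # test sample drawn from class 1's subspace
print(src_classify(A_blocks, b))  # expected: 1
```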
Image Corruption
1 Sparse representation + sparse error
b = Ax + e
2 Occlusion compensation:
b = [A | I] [x; e] = Bw
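A minimal numpy check (synthetic sizes, hypothetical values) that the stacked system b = [A | I][x; e] reproduces a corrupted sample:

```python
# The occlusion model b = [A | I][x; e] = Bw absorbs gross corruption e
# as extra sparse unknowns against the identity dictionary.
import numpy as np

rng = np.random.default_rng(1)
d, n = 8, 4
A = rng.standard_normal((d, n))
x = np.array([0.0, 2.0, 0.0, -1.0])   # sparse representation
e = np.zeros(d); e[3] = 5.0           # sparse error: one corrupted pixel
b = A @ x + e

B = np.hstack([A, np.eye(d)])         # extended dictionary [A | I]
w = np.concatenate([x, e])            # stacked unknowns [x; e]
assert np.allclose(B @ w, b)          # b = Bw reproduces the corrupted sample
```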
Performance on the AR database
Reference:
AY, et al. Robust face recognition via sparse representation. IEEE PAMI, 2009.
Face Alignment
Seek a 2-D transformation τ_i: b ◦ τ_i = A_i x + e. (1)
Although ‖x‖1 is no longer penalized, the problem becomes nonlinear.
Linear approximation: b ◦ τ_i + ∇_τ(b ◦ τ_i) · Δτ_i ≈ A_i x + e. (2)
Convert to a linear equation:
b_i^{(k)} = [A_i, −J_i^{(k)}] w + e, (3)
where w ≐ [x^T, Δτ_i^T]^T.
Demo I: Misalignment & Corruption Compensation
Alignment demo. Reference:
Wagner, et al. Towards a Practical Face Recognition System: Robust Registration and Illumination via Sparse Representation. CVPR, 2009.
Question: How do we efficiently estimate high-dimensional (HD) sparse signals?
“Black gold” age [Claerbout & Muir 1973, Taylor, Banks & McCoy 1979]
Figure: Deconvolution of spike train.
Basis pursuit [Chen-Donoho 1999]:
x∗ = arg min ‖x‖1, subject to b = Ax
The Lasso (least absolute shrinkage and selection operator) [Tibshirani 1996]
x∗ = arg min ‖b− Ax‖2, subject to ‖x‖1 ≤ k
ℓ0/ℓ1 Equivalence Relationship
ℓ0-minimization over an underdetermined system (NP-hard):
x∗ = arg min_x ‖x‖0 subj. to b = Ax.
‖·‖0 simply counts the number of nonzero entries.
ℓ1-minimization (a linear program) [Candès & Tao 2006, Donoho 2006]:
x∗ = arg min_x ‖x‖1 subj. to b = Ax.
‖x‖1 = |x_1| + |x_2| + ··· + |x_n|.
Figure: geometry of the ℓ0 and ℓ1 balls meeting the affine set b = Ax.
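A quick numerical illustration of the equivalence, assuming a synthetic Gaussian A: the minimum-ℓ2 solution of the underdetermined system is dense, while the ℓ1 solution (solved here as an LP) recovers the sparse signal.

```python
# Compare the minimum-l2 solution against l1-minimization (basis pursuit)
# on an underdetermined system; synthetic data, sizes are illustrative.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
d, n = 30, 60
A = rng.standard_normal((d, n))
x_true = np.zeros(n); x_true[[3, 17, 42]] = [1.5, -2.0, 0.7]  # 3-sparse
b = A @ x_true

# minimum-l2 solution: spreads energy over all coordinates
x_l2 = np.linalg.pinv(A) @ b

# l1-minimization as an LP with the split x = u - v, u, v >= 0
res = linprog(np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=b,
              bounds=[(0, None)] * (2 * n))
x_l1 = res.x[:n] - res.x[n:]

print(np.count_nonzero(np.abs(x_l2) > 1e-6))  # typically dense
print(np.linalg.norm(x_l1 - x_true))          # close to zero if recovery succeeds
```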
ℓ1-Minimization via Linear Programming
Using interior-point methods [Karmarkar ’84]
Log-barrier: min_x 1^T x − µ Σ_{i=1}^n log x_i, subj. to Ax = b, x ≥ 0. (4)
Using the Karush-Kuhn-Tucker (KKT) conditions:
1 − µX^{−1}1 − A^T y = 0, (5)
where x ≥ 0 are the primal variables and y are the dual variables.
Update by solving a linear system with O(n^3) cost [Monteiro & Adler ’89]:
Z^{(k)}Δx + X^{(k)}Δz = µ1 − X^{(k)}z^{(k)},
AΔx = 0,
A^T Δy + Δz = 0. (6)
Fast ℓ1-minimization is still a difficult problem!
Interior-point methods are very expensive in high-dimensional spaces.
References
1 Primal-dual interior-point methods: Log-barrier [Frisch 1955, Karmarkar 1984, Megiddo 1989, Monteiro-Adler 1989, Kojima-Megiddo-Mizuno 1993]
2 Homotopy methods: Homotopy [Osborne-Presnell-Turlach 2000, Malioutov-Cetin-Willsky 2005, Donoho-Tsaig 2006]; Polytope Faces Pursuit (PFP) [Plumbley 2006]; Least Angle Regression (LARS) [Efron-Hastie-Johnstone-Tibshirani 2004]
3 Gradient projection methods: Gradient Projection for Sparse Reconstruction (GPSR) [Figueiredo-Nowak-Wright 2007]; Truncated Newton Interior-Point Method (TNIPM) [Kim-Koh-Lustig-Boyd-Gorinevsky 2007]
4 Iterative thresholding methods: Soft Thresholding [Donoho 1995]; Sparse Reconstruction by Separable Approximation (SpaRSA) [Wright-Nowak-Figueiredo 2008]
5 Proximal gradient methods [Nesterov 1983, Nesterov 2007]: FISTA [Beck-Teboulle 2009]; Nesterov’s Method (NESTA) [Becker-Bobin-Candes 2009]
6 Augmented Lagrange multiplier methods [Yang-Zhang 2009, AY et al. 2010]: YALL1 [Yang-Zhang 2009]; Primal ALM, Dual ALM [AY et al. 2010]
References:
AY, et al. A review of fast ℓ1-minimization algorithms for robust face recognition. Submitted to SIAM Imaging Sciences, 2010.
Iterative Soft-Thresholding (IST) Methods
Objective: x∗ = arg min ‖x‖1 subj. to ‖e‖ = ‖b − Ax‖ < ε
F(x) ≐ (1/2)‖b − Ax‖_2^2 + λ‖x‖1 = f(x) + λ g(x)
IST iteratively approximates the composite objective function:
x^{(k+1)} ≈ arg min_x { f(x^{(k)}) + (x − x^{(k)})^T ∇f(x^{(k)}) + (∇^2 f(x^{(k)})/2) ‖x − x^{(k)}‖_2^2 + λ g(x) }
= arg min_x { (x − x^{(k)})^T ∇f(x^{(k)}) + (α^{(k)}/2) ‖x − x^{(k)}‖_2^2 + λ g(x) },
where the Hessian ∇^2 f(x) is approximated by a diagonal matrix α^{(k)} I.
A closed-form solution exists element-wise [Donoho ’95, Wright et al. ’08]:
x_i^{(k+1)} = arg min_{x_i} { (x_i − u_i^{(k)})^2 / 2 + (λ/α^{(k)}) |x_i| } = soft(u_i^{(k)}, λ/α^{(k)}),
where u^{(k)} = x^{(k)} − ∇f(x^{(k)})/α^{(k)}.
Augmented Lagrange Multiplier
ALM considers an augmented Lagrange function
L_µ(x, y) = ‖x‖1 + ⟨y, b − Ax⟩ + (µ/2)‖b − Ax‖_2^2,
where y are the Lagrange multipliers for the constraint b = Ax.
It can be shown that if y∗(µ) is optimal [Hestenes ’69, Powell ’69, Bertsekas ’03]:
x∗(µ) = arg min_x L_µ(x, y∗); x∗∗ = lim_{µ→∞} x∗(µ)
Iteratively update x, y, and µ with O(dn) cost:
x_{k+1} = arg min_x L_{µ_k}(x, y_k),
y_{k+1} = y_k + µ_k (b − A x_{k+1}),
µ_{k+1} → ∞.
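A hedged sketch of this iteration, with the x-subproblem solved approximately by a few proximal-gradient (soft-thresholding) steps; that inner solver, the µ schedule, and all sizes are assumptions for illustration, not necessarily the Primal ALM variant from the talk.

```python
# Primal ALM sketch for min ||x||_1 s.t. Ax = b with inexact inner solves.
import numpy as np

def soft(u, t):
    """Element-wise soft-thresholding operator."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def alm_l1(A, b, outer=30, inner=50, mu=1.0, rho=1.5):
    x = np.zeros(A.shape[1])
    y = np.zeros(A.shape[0])
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz bound for A^T A
    for _ in range(outer):
        tau = mu * L                       # prox-gradient step size 1/tau
        for _ in range(inner):             # approximate arg min_x L_mu(x, y)
            g = mu * A.T @ (A @ x - b) - A.T @ y
            x = soft(x - g / tau, 1.0 / tau)
        y = y + mu * (b - A @ x)           # multiplier update
        mu = min(mu * rho, 1e6)            # drive mu upward
    return x

rng = np.random.default_rng(4)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100); x_true[[2, 50, 90]] = [1.0, -2.0, 0.5]
b = A @ x_true
x_hat = alm_l1(A, b)
print(np.linalg.norm(x_hat - x_true))      # reconstruction error
```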
Demo II: Speed of ALM vs Interior-Point
Table: Source signal in 1000-D; sparsity = 200; random projection = 600-D.
Algorithm   Runtime
PDIPA       63 s
ALM         0.16 s
Distributed Object Recognition: Problem Statement
1 L camera sensors observe a single object in 3-D.
2 The relative positions between cameras are unknown, and cross-sensor communication is prohibited.
3 On each camera, seek an encoding function for a high-dimensional, sparse x_i (SIFT histogram):
f : x_i ∈ R^D ↦ b_i ∈ R^d
4 At the base station, upon receiving b1, b2, · · · , bL, simultaneously recover
x1, x2, · · · , xL,
and classify the object observed in the scene.
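One way to realize the encoder f, assuming (as in compressive sensing) a random projection A regenerated from a seed shared between each camera and the base station; the sizes and seed below are illustrative.

```python
# Per-camera encoder f: x in R^D -> b = A x in R^d with a seed-shared A.
import numpy as np

D, d, seed = 1000, 100, 42          # histogram dim, message dim, shared seed

def make_projection(d, D, seed):
    """Camera and base station regenerate the same A from a shared seed."""
    return np.random.default_rng(seed).standard_normal((d, D)) / np.sqrt(d)

A = make_projection(d, D, seed)
x = np.zeros(D)                     # synthetic sparse, nonnegative SIFT histogram
x[[10, 200, 512, 900]] = [3.0, 1.0, 7.0, 2.0]
b = A @ x                           # transmit 100 numbers instead of 1000
print(b.shape)                      # (100,)
```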
Key Observations: Scale Invariant Feature Transform
Figure: (a) Histogram 1; (b) Histogram 2.
All SIFT histograms are nonnegative and sparse.
Multiple-view histograms share joint sparse patterns.
Reference:
AY, et al. Multiple-view object recognition in smart camera networks. Springer, 2010.
The System
b_i = A x_i, where x_i is assumed sparse.
Joint Sparsity Model
Definition: Joint Sparsity Model [Baron et al. 2005]
x_1 = x_c + z_1, …, x_L = x_c + z_L.
xc is called the common component, and zi is called an innovation.
Recovery of the JS model:
[b_1; …; b_L] = [A_1 A_1 0 ··· 0; A_2 0 A_2 ··· 0; …; A_L 0 ··· 0 A_L] [x_c; z_1; …; z_L] ⇔ b′ = A′x′ ∈ R^{dL}.
1 New histogram vector remains nonnegative and sparse.
2 The joint sparsity x_c is automatically determined by ℓ1-min: no prior training, and no assumptions about fixed cameras or calibration.
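Assembling b′ = A′x′ is mechanical; a numpy sketch with synthetic A_i and L = 3 (sizes are illustrative):

```python
# Build the joint-sparsity system: the first block column carries the
# common component xc, the diagonal blocks carry the innovations z_i.
import numpy as np

rng = np.random.default_rng(5)
L, d, D = 3, 20, 50
As = [rng.standard_normal((d, D)) for _ in range(L)]

# A' = [[A1, A1, 0,  0 ],
#       [A2, 0,  A2, 0 ],
#       [A3, 0,  0,  A3]]
rows = []
for i, Ai in enumerate(As):
    blocks = [Ai] + [Ai if j == i else np.zeros((d, D)) for j in range(L)]
    rows.append(np.hstack(blocks))
A_prime = np.vstack(rows)

xc = np.zeros(D); xc[[1, 7]] = [2.0, 1.0]              # common component
zs = [np.zeros(D) for _ in range(L)]; zs[0][30] = 0.5  # one innovation
x_prime = np.concatenate([xc] + zs)
b_prime = A_prime @ x_prime
assert b_prime.shape == (L * d,)                       # b' lives in R^{dL}
# consistency with the per-camera model b_i = A_i (xc + z_i):
assert np.allclose(b_prime[:d], As[0] @ (xc + zs[0]))
```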
Berkeley Multiview Wireless (BMW) Database
20 landmarks at UC Berkeley.
16 different vantage points (large baseline); five images at one location (short baseline).
Low-quality images: low resolution, inaccurate focal length, dusty lenses.
Experiment: Accuracy on BMW Database
A better recognition rate is achieved using multiple views while consuming less bandwidth!
Reference:
AY, et al. Towards an efficient distributed object recognition system in wireless smart camera networks. Information Fusion, 2010.
A distributed network shall be greater than the sum of its parts!
Centralized Perception
Up: powerful processors, unlimited memory, unlimited bandwidth
Down: single modality
Distributed Perception
Down: mobile processors, limited onboard memory, band-limited communications
Up: distributed, multi-modality
Our approach to the unique challenges:
1 How do we design real-time recognition systems in sensor networks? A: Efficient numerical solvers plus new computational models: parallel & cloud computing.
2 How do we achieve extremely high accuracy? A: Pay attention to the special structures in high-dimensional data imposed by applications; simple solutions are often the best (e.g., sparse representation).
3 How do we reconstruct 3-D environments using mobile smart cameras? A: Don’t just label the images; take advantage of the available 3-D geometry (e.g., joint sparsity).
Sparse Representation is “the Next Wave”?
Single-Pixel Camera for Deep-Space Imaging [Baraniuk 2008]
Background Subtraction [Chellappa 2008]
MRI Imaging [Lustig 2007]
Robust PCA [Candes 2009]
References
Acknowledgments
UC Berkeley: Dr. S. Sastry, Dr. R. Bajcsy, Dr. E. Seto, Dr. T. Darrell, Dr. J. Malik, N. Naikal, V. Shia, P. Yan
Univ. Illinois: A. Ganesh, Z. Zhou, A. Wagner
MSR Asia: Dr. Y. Ma, Dr. J. Wright
Funding support:
ARO MURI: Heterogeneous Sensor Networks in Urban Terrains
ARL: Micro Autonomous Systems and Technology
Patents:
Yang, et al. “Recognition via High-Dimensional Data Classification.” US & China Patent, 2009.
Yang, et al. “System for Detection of Body Motion.” US Patent, 2010.
Publications:
Wright, Yang, Ganesh, Sastry, Ma. “Robust face recognition via sparse representation.” IEEE PAMI, 2009.
Yang, Gastpar, Bajcsy, Sastry. “Distributed Sensor Perception via Sparse Representation.” Proceedings of the IEEE, 2010.
Naikal, Yang, Sastry. “Towards an efficient distributed object recognition system in wireless smart camera networks.” Information Fusion, 2010.
Yang, Ganesh, Zhou, Sastry, Ma. “A review of fast ℓ1-minimization algorithms in robust face recognition.” arXiv, 2010.
Ganesh, Ma, Wagner, Wright, Yang. “Robust face recognition by sparse representation.” (submitted) Cambridge Press, 2010.
Feasibility and Uniqueness: ℓ0-Minimization
Spark Condition
Spark(A): the smallest number of columns of A that are linearly dependent.
1 Example I: identity matrix I ∈ R^{d×d}: Spark(I) = d + 1.
2 Example II: A = [1 0 1 0; 0 1 0 1]: Spark(A) = 2.
3 Example III: random matrix [v_1, v_2, ··· , v_n] ∈ R^{d×n}: Spark(A) = d + 1 (with high probability).
A sparse signal x can be uniquely recovered by ℓ0-min if
‖x‖0 < Spark(A)/2.
Proof.
1 Suppose x_1 ≠ x_2 both satisfy the sparsity bound, with b = Ax_1 and b = Ax_2.
2 Then A(x_1 − x_2) ≐ Ay = b − b = 0, so the columns of A on the support of y are linearly dependent.
3 But ‖y‖0 < Spark(A)/2 + Spark(A)/2 = Spark(A). Contradiction.
Estimating Spark(A) is as expensive as ℓ0-min itself!
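For tiny matrices the spark can still be checked by brute force (exponential cost, illustration only), confirming Examples I and II:

```python
# Brute-force spark: smallest k such that some k columns are dependent.
from itertools import combinations
import numpy as np

def spark(A, tol=1e-10):
    d, n = A.shape
    for k in range(1, n + 1):
        for cols in combinations(range(n), k):
            # k columns are dependent iff the submatrix has rank < k
            if np.linalg.matrix_rank(A[:, cols], tol=tol) < k:
                return k
    return n + 1   # all columns independent (convention: spark = n + 1)

A = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.]])
print(spark(A))          # 2: columns 1 and 3 are identical
print(spark(np.eye(3)))  # 4 = d + 1: no dependent subset of columns
```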
Feasibility and Uniqueness: ℓ1-Minimization
k-Neighborliness Condition
Define cross polytope C and quotient polytope P such that P = AC .
x is k-sparse ⇔ x lies in a unique (k − 1)-face of C.
Necessary and sufficient conditions:
1 If the (k − 1)-face where x lies maps to a face of P, then ℓ1/ℓ0 equivalence holds for this specific x.
2 If all (k − 1)-faces of C map to faces of P on the boundary, ℓ1/ℓ0 equivalence holds for all k-sparse signals.