IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 11, NOVEMBER 2012

Ranked Sparse Signal Support Detection

Alyson K. Fletcher, Member, IEEE, Sundeep Rangan, Member, IEEE, and Vivek K Goyal, Senior Member, IEEE

Abstract—This paper considers the problem of detecting the support (sparsity pattern) of a sparse vector from random noisy measurements. The conditional power of a component of the sparse vector is defined as the energy conditioned on the component being nonzero. Analysis of a simplified version of orthogonal matching pursuit (OMP), called sequential OMP (SequOMP), demonstrates the importance of knowledge of the rankings of conditional powers. When the simple SequOMP algorithm is applied to components in nonincreasing order of conditional power, the detrimental effect of dynamic range on thresholding performance is eliminated. Furthermore, under the most favorable conditional powers, the performance of SequOMP approaches maximum likelihood performance at high signal-to-noise ratio.

Index Terms—Compressed sensing, convex optimization, lasso, maximum likelihood estimation, orthogonal matching pursuit, random matrices, sparse Bayesian learning, sparsity, thresholding.

I. INTRODUCTION

Sets of signals that are sparse or approximately sparse with respect to some basis are ubiquitous because signal modeling often has the implicit goal of finding such bases. Using a sparsifying basis, a simple abstraction that applies in many settings is for

y = Ax + d    (1)

to be observed, where A ∈ R^{m×n} is known, x ∈ R^n is the unknown sparse signal of interest, and d ∈ R^m is random noise. When m < n, constraints or prior information about x are essential to both estimation (finding a vector x̂ such that ||x − x̂|| is small) and detection (finding an index set Ŝ equal to the support of x). The focus of this paper is on the use of magnitude rank information on x—in addition to sparsity—in the support detection problem. We show that certain scaling laws relating the problem dimensions and the noise level are changed dramatically by exploiting the rank information in a simple sequential detection algorithm.

The simplicity of the observation model (1) belies the variety of questions that can be posed and the difficulty of precise analysis. In general, the performance of any algorithm is a complicated function of the problem dimensions, the noise, and the distribution of x.

Manuscript received November 02, 2011; revised April 24, 2012; accepted June 18, 2012. Date of publication July 16, 2012; date of current version October 09, 2012. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Namrata Vaswani. The work of V. K Goyal was supported in part by the National Science Foundation under CAREER Grant No. 0643836. This work was presented in part at the IEEE International Symposium on Information Theory, Seoul, Korea, June–July 2009.

A. K. Fletcher is with the Department of Electrical Engineering, University of California, Santa Cruz, CA 95064 USA (e-mail: [email protected]).

S. Rangan is with the Department of Electrical and Computer Engineering, Polytechnic Institute of New York University, Brooklyn, NY 11201 USA (e-mail: [email protected]).

V. K Goyal is with the Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TSP.2012.2208957

To enable results that show the qualitative behavior in terms of problem dimensions and a few other parameters, we assume the entries of A are i.i.d. normal, and we describe x by its energy and its smallest-magnitude nonzero entry.

We consider a partially-random signal model

x_j = b_j u_j,  j = 1, …, n,    (2)

where the components b_j of the vector b are i.i.d. Bernoulli random variables with Pr(b_j = 1) = λ, and u is a nonrandom parameter vector with all nonzero entries. The value p_j = u_j^2 represents the conditional power of the component x_j in the event that b_j = 1. We consider the problem where the estimator knows neither b nor u, but may know the order or rank of the conditional powers. In this case, the estimator can, for example, sort the components of x in an order such that

p_1 ≥ p_2 ≥ ⋯ ≥ p_n.    (3)

A stylized application in which the conditional ranks (and furthermore approximate conditional powers) can be known is random access communication as described in [1]. Also, partial orders of conditional powers can be known in some applications because of the magnitude variation of wavelet coefficients across scale [2]. Along with being motivated by these applications, we aim to provide a new theoretical grounding for a known empirical phenomenon: orthogonal matching pursuit (OMP) and sparse Bayesian learning (see references below) exhibit improvements in detection performance when the nonzero entries of the signal have higher dynamic range.

A. Main Contribution

Rank information is extremely valuable in support detection. Abstracting from the applications above, we show that when conditional rank information is available, a very simple detector, termed sequential orthogonal matching pursuit (SequOMP), can be effective. The SequOMP algorithm is a one-pass version of the well-known OMP algorithm. Similar to several works in sparsity pattern recovery [3]–[5], we analyze the performance of SequOMP by estimating a scaling on the minimum number of measurements m needed to asymptotically reliably detect the sparsity pattern (support) of x in the limit of large random matrices A. Although the SequOMP algorithm is extremely simple, we show the following:

• When the power orders are known and the signal-to-noise ratio (SNR) is high, the SequOMP algorithm exhibits a scaling in the minimum number of measurements for sparsity pattern recovery that is within a constant factor of the more sophisticated lasso and OMP algorithms. In particular, SequOMP exhibits a resistance to large dynamic ranges, which is one of the main motivations for using lasso and OMP.

• When the power profile can be optimized, SequOMP can achieve a measurement scaling for sparsity pattern recovery that is within a constant factor of maximum likelihood (ML) detection. This scaling is better than the best known sufficient conditions for lasso and OMP.

TABLE I
SUMMARY OF RESULTS ON MEASUREMENT SCALINGS FOR ASYMPTOTIC RELIABLE DETECTION FOR VARIOUS DETECTION ALGORITHMS. ONLY LEADING TERMS ARE SHOWN. SEE BODY FOR DEFINITIONS AND TECHNICAL LIMITATIONS.

The results are not meant to suggest that SequOMP is a good algorithm; other algorithms such as OMP can perform dramatically better. The point is to concretely and provably demonstrate the value of conditional rank information.

B. Related Work

Under an i.i.d. Gaussian assumption on d, maximum likelihood estimation of x under a sparsity constraint is equivalent to finding the sparse x̂ for which ||y − Ax̂|| is minimized. This is called optimal sparse approximation of y using dictionary A, and it is NP-hard [6]. Several greedy heuristics (matching pursuit [7] and its variants with orthogonalization [8]–[10] and iterative refinement [11], [12]) and convex relaxations (basis pursuit [13], lasso [14], Dantzig selector [15], and others) have been developed for sparse approximation, and under certain conditions they give optimal or near-optimal performance [16]–[18]. Results showing that near-optimal estimation of x is obtained with convex relaxations, pointwise over compressible x and with high probability over some random ensemble for A, form the heart of the compressed sensing literature [19]–[21]. Under a probabilistic model for x and certain additional assumptions, exact asymptotic performances of several estimators are known [22].

Our interest is in recovery or detection of the support (or sparsity pattern) of x rather than the estimation of x. In the noiseless case of d = 0, optimal estimation of x can yield x exactly under certain conditions on x; estimation and detection then coincide, and some papers cited above and notably [23] contain relevant results. In the general noisy case, direct analysis of the detection problem has yielded much sharper results.

A standard formulation is to treat x as a nonrandom parameter vector and its support as either nonrandom with weight k or random with a uniform distribution over the weight-k supports. The minimum probability of detection error is then attained with ML detection. Sufficient conditions for the success of ML detection are due to Wainwright [3]; necessary conditions based on channel capacity were given by several authors [24]–[27], and conditions more stringent in many regimes and a comparison of results appear in [5]. Necessary and sufficient conditions for lasso were determined by Wainwright [4]. Sufficient conditions for orthogonal matching pursuit (OMP) were given by Tropp and Gilbert [28] and improved by Fletcher and Rangan [29]. Even simpler than OMP is a thresholding algorithm analyzed in a noiseless setting in [30] and with noise in [5]. These results are summarized in Table I, using terminology defined formally in Section II. While thresholded backprojection is unsophisticated from a signal processing point of view, it is simple and commonly used in a variety of fields. Improvements relative to it are needed to justify the use of methods with higher complexity.

Some of our results depend on knowledge of the ordering of the conditional powers of the entries of x. Several earlier works have introduced other models of partial information about signal support or varying likelihoods of indexes appearing in the support [31]–[33]. Statistical dependencies between the components of x can be exploited very efficiently using a recent extension [34] of the generalized approximate message passing framework [35].

C. Paper Organization

The remainder of the paper is organized as follows. The setting is formalized in Section II. In particular, we define all the key problem parameters. Common algorithms and previous results on their performance are then presented in Section III. We will see that there is a potentially large performance gap between the simplest thresholding algorithm and optimal ML detection, depending on the signal-to-noise ratio (SNR) and the dynamic range of x. Section IV presents a new detection algorithm, sequential orthogonal matching pursuit (SequOMP), that exploits knowledge of conditional ranks. Numerical experiments are reported in Section V. Conclusions are given in Section VI, and proofs are relegated to the Appendix.

II. PROBLEM FORMULATION

In the observation model y = Ax + d, let A ∈ R^{m×n} and d ∈ R^m have i.i.d. N(0, 1/m) entries. This is a normalization under which the ratio of conditional total signal energy to total noise energy,

SNR = E[ ||Ax||^2 | x ] / E[ ||d||^2 ],    (4)

simplifies to

SNR = ||x||^2.    (5)

This is a random variable because x is a random vector.

Let S = { j : x_j ≠ 0 } denote the support of x. Using signal model (2), Pr(j ∈ S) = λ. The sparsity level of x is k = |S|.

An estimator produces an estimate Ŝ of S based on the observed noisy vector y. Given an estimator, its probability of error¹ p_err = Pr(Ŝ ≠ S) is taken with respect to randomness in the matrix A, noise vector d, and signal x. Our interest is in relating the scaling of problem parameters with the success of various algorithms. For this, we define the following criterion.

Definition 1: Suppose that we are given deterministic sequences m = m(n), λ = λ(n), and u = u(n) that vary with n. For a given detection algorithm, the probability of error p_err is some function of n. We say that the detection algorithm achieves asymptotic reliable detection when p_err → 0 as n → ∞.

We will see that two key factors influence the ability to detect S. The first is the total SNR defined above. The second is what we call the minimum-to-average ratio

MAR = min_{j∈S} |x_j|^2 / ( ||x||^2 / k ).    (6)

Like SNR, this is a random variable. Since x has k nonzero elements, ||x||^2/k is the average of { |x_j|^2 : j ∈ S }. Therefore,

MAR ∈ (0, 1],

with the upper limit occurring when all the nonzero entries of x have the same magnitude.

Finally, we define the minimum component SNR to be

SNR_min = min_{j∈S} E[ ||x_j a_j||^2 ] / E[ ||d||^2 ] = min_{j∈S} |x_j|^2,    (7)

where a_j is the jth column of A and the second equality follows from the normalization chosen for A and d. The random variable SNR_min has a natural interpretation: the numerator is the signal power due to the smallest nonzero component in x, while the denominator is the total noise power. The ratio SNR_min thus represents the contribution to the SNR from the smallest nonzero component of x. Observe that (5) and (6) show

SNR_min = (MAR/k) · SNR.    (8)

¹An alternative to this definition of p_err could be to allow a nonzero fraction of detection errors [26], [27].

We will be interested in estimators that exploit minimal prior knowledge of x: either only knowledge of the sparsity level (through k or λ) or also knowledge of the conditional ranks (through the imposition of (3)). In particular, full knowledge of u would change the problem considerably because the finite number of possibilities for x could be exploited.
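To make these definitions concrete, the following NumPy sketch draws one realization of the model under the normalization of this section and evaluates SNR, MAR, and SNR_min numerically; the dimensions and the amplitude vector u are arbitrary illustrative choices (not from the paper), and the identity (8) is checked at the end.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, lam = 1000, 400, 0.02           # illustrative dimensions and Bernoulli probability
u = rng.uniform(0.5, 2.0, size=n)      # nonrandom conditional amplitudes (illustrative)
b = rng.random(n) < lam                # i.i.d. Bernoulli support indicators
x = b * u                              # partially-random signal model (2)

A = rng.normal(0.0, np.sqrt(1.0 / m), size=(m, n))   # i.i.d. N(0, 1/m) entries
d = rng.normal(0.0, np.sqrt(1.0 / m), size=m)        # noise with E||d||^2 = 1
y = A @ x + d                                        # observation model (1)

S = np.flatnonzero(x)                  # support of x
k = S.size
snr = np.sum(x**2)                     # (5): SNR = ||x||^2 under this normalization
mar = np.min(x[S]**2) / (snr / k)      # (6): minimum-to-average ratio, in (0, 1]
snr_min = np.min(x[S]**2)              # (7): minimum component SNR
assert np.isclose(snr_min, mar * snr / k)   # identity (8)
```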

III. COMMON DETECTION METHODS

In this section, we review several asymptotic analyses for detection of sparse signal support. These previous results hold pointwise over sequences of problems of increasing dimension n, i.e., treating x as an unknown deterministic quantity. That makes these results stronger than results that are limited to the model (2), where the b_j's are i.i.d. Bernoulli variables. To reflect the pointwise validity of these results, they are stated in terms of deterministic sequences m = m(n), k = k(n), SNR = SNR(n), and MAR = MAR(n) that depend on the dimension n and are arbitrary aside from satisfying k ≤ n and the definitions of the previous section. To simplify the notation, we drop the dependence on n of m, k, SNR, and MAR. When the results are tabulated for comparison with each other and with the results of Section IV, we replace k with λn; this specializes the results to the model (2).

A. Optimal Detection With No Noise

To understand the limits of detection, it is useful to first consider the minimum number of measurements when there is no noise. Suppose that k is known to the detector. With no noise, the observed vector is y = Ax, which will belong to one of the (n choose k) subspaces spanned by k columns of A. If m ≥ k + 1, then these subspaces will be distinct with probability 1. Thus, an exhaustive search through the subspaces will reveal which subspace y belongs to and thus determine the support S. This shows that with no noise and no computational limits, the scaling in measurements of

m ≥ k + 1    (9)

is sufficient for asymptotic reliable detection.

Conversely, if no prior information is known at the detector other than x being k-sparse, then the condition (9) is also necessary. If m ≤ k, then for almost all A, any m columns of A span R^m. Consequently, any observed vector y is consistent with any support of weight k. Thus, the support cannot be determined without further prior information on the signal x.

Note that we are considering correct detection with probability 1 (over the random choice of A) for a single k-sparse x. It is elementary to show that correct detection with probability 1 (again over the random choice of A) for all k-sparse x requires m ≥ 2k.

B. ML Detection With Noise

Now suppose there is noise. Since x is an unknown deterministic quantity, the probability of error in detecting the support is minimized by maximum likelihood (ML) detection. Since the noise d is Gaussian, the ML detector finds the k-dimensional subspace spanned by k columns of A containing the maximum energy of y.
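Though intractable at scale, the ML rule just described is easy to state in code. The sketch below (our illustration; the function name is not from the paper) searches all k-subsets of columns and returns the subset whose span captures the most energy of y:

```python
from itertools import combinations
import numpy as np

def ml_support(y, A, k):
    """Exhaustive ML support detection: return the k column indices of A
    whose span contains the maximum energy of y (Gaussian-noise ML rule).
    Cost grows as (n choose k); practical only for tiny problems."""
    best, best_energy = None, -np.inf
    for S in combinations(range(A.shape[1]), k):
        AS = A[:, list(S)]
        # Energy of the orthogonal projection of y onto span(A_S).
        coef, *_ = np.linalg.lstsq(AS, y, rcond=None)
        energy = np.sum((AS @ coef) ** 2)
        if energy > best_energy:
            best, best_energy = S, energy
    return set(best)
```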


The ML estimator was first analyzed by Wainwright [3]. He shows that there exists a constant C > 0 such that if

m ≥ C max{ k log(n/k), (k/(SNR · MAR)) log(n − k) } = C max{ k log(n/k), log(n − k)/SNR_min },    (10)

then ML will asymptotically detect the correct support. The equivalence of the two expressions in (10) is due to (8). Also, [5, Thm. 1] (generalized in [36, Thm. 1]) shows that, for any δ > 0, the condition

m ≥ max{ (1 − δ) (2 log(n − k)) / log(1 + SNR_min), k + 1 }    (11)

is necessary. Observe that when SNR_min → ∞, the lower bound (11) approaches k + 1, matching the noise-free case (9) as expected.

These necessary and sufficient conditions for ML appear in Table I with smaller terms and the infinitesimal δ omitted for simplicity.

C. Thresholding

The simplest method to detect the support is to use a thresholding rule of the form

Ŝ = { j : ρ^2(j) > μ },    (12)

where μ > 0 is a threshold parameter and ρ(j) is the correlation coefficient:

ρ^2(j) = |a_j' y|^2 / ( ||a_j||^2 ||y||^2 ).

(A minimal implementation sketch of this rule is given at the end of this subsection.) Thresholding has been analyzed in [5], [30], [37]. In particular, [5, Thm. 2] is the following: Suppose

m > (1 + δ) 2L(n,k) ((1 + SNR)/(SNR · MAR)) k log(n − k) = (1 + δ) 2L(n,k) ((1 + SNR)/SNR_min) log(n − k),    (13)

where δ > 0 and

L(n,k) = [ 1 + ( log k / log(n − k) )^{1/2} ]^2.    (14)

Then there exists a sequence of detection thresholds μ = μ(n) such that Ŝ achieves asymptotic reliable detection of the support. As before, the equivalence of the two expressions in (13) is due to (8).

Comparing the sufficient condition (13) for thresholding with

the necessary condition (11), we see two distinct problems with thresholding:

• Constant offset: The scaling (13) for thresholding shows a factor 2L(n,k) in place of the factor 2 in (11). It is easily verified that, for k ≤ n − k,

1 ≤ L(n,k) ≤ 4,    (15)

so this difference in factors alone could require that thresholding use up to 4 times more measurements than ML for asymptotic reliable detection. Combining the inequality (15) with (13), we see that the more stringent, but simpler, condition

m > (1 + δ) · 8 · ((1 + SNR)/(SNR · MAR)) · k log(n − k)    (16)

is also sufficient for asymptotic reliable detection with thresholding. This simpler condition is shown in Table I, where we have omitted the infinitesimal quantity δ to simplify the table entry.

• SNR saturation: In addition to the constant offset, thresholding also requires a factor of 1 + SNR more measurements than ML. This factor has a natural interpretation as intrinsic interference: when detecting any one component of the vector x, thresholding sees the energy from the other components of the signal as interference. This interference is distinct from the additive noise d, and it increases the effective noise by a factor of 1 + SNR.

The intrinsic interference results in a large performance gap at high SNRs. In particular, as SNR → ∞, (13) reduces to

m > (1 + δ) (2L(n,k)/MAR) k log(n − k).    (17)

In contrast, ML may be able to succeed with a scaling m = O(k) for high SNRs.

D. Lasso and OMP Detection

While ML has clear advantages over thresholding, it is not computationally tractable for large problems. One practical method is lasso [14], also called basis pursuit denoising [13]. The lasso estimate of x is obtained by solving the convex optimization

x̂ = arg min_x ||y − Ax||_2^2 + μ ||x||_1,

where μ > 0 is an algorithm parameter that encourages sparsity in the solution x̂. The nonzero components of x̂ can then be used as an estimate of the support. (A minimal lasso-based support estimator is sketched after the comparison list below.)

Wainwright [4] has given necessary and sufficient conditions for asymptotic reliable detection with lasso. Partly because of freedom in the choice of the sequence of parameters μ = μ(n), the finite-SNR results are difficult to interpret. Under certain conditions with SNR growing unboundedly with n, matching necessary and sufficient conditions can be found. Specifically, if SNR · MAR → ∞ and k → ∞, with k/n → 0, the scaling

m = 2k log(n − k) + k + 1    (18)

is both necessary and sufficient for asymptotic reliable detection.

Another common approach to support detection is the OMP algorithm [8]–[10]. This was analyzed by Tropp and Gilbert [28] in a setting with no noise. The analysis was generalized to the present setting with noise by Fletcher and Rangan [29]. The result is very similar to condition (18): if SNR · MAR → ∞ and k → ∞, with k/n → 0, a sufficient condition for asymptotic reliable recovery is

m ≥ 2k log(n − k).    (19)

The main result of [29] also allows uncertainty in k.

The conditions (18) and (19) are both shown in Table I. As usual, the table entries are simplified by including only the leading terms.

The lasso and OMP scaling laws, (18) and (19), can be compared with the high-SNR limit for the thresholding scaling law in (17). This comparison shows the following:

• Removal of the constant offset: The factor 2L(n,k) in the thresholding expression is replaced by a factor 2 in the lasso and OMP scaling laws. Similar to the discussion above, this implies that lasso and OMP could require up to 4 times fewer measurements than thresholding.

• Dynamic range: In addition, neither lasso nor OMP has any dependence on MAR. This gain can be large when there is high dynamic range, i.e., when MAR is near zero.

• Limits at high SNR: We also see from (18) and (19) that both lasso and OMP are unable to achieve the scaling m = O(k) that may be achievable with ML at high SNR. Instead, both lasso and OMP have the scaling m = Θ(k log(n − k)), similar to the minimum scaling possible with thresholding.

E. Other Sparsity Detection Algorithms

Recent interest in compressed sensing has led to a plethora of algorithms beyond OMP and lasso. Empirical evidence suggests that the most promising algorithms for support detection are the sparse Bayesian learning methods developed in the machine learning community [38] and introduced into signal processing applications in [39], with related work in [40]. Unfortunately, a comprehensive summary of these algorithms is far beyond the scope of this paper. Our interest is not in finding the optimal algorithm, but rather in explaining qualitative differences between algorithms and in demonstrating the value of knowing conditional ranks a priori.

IV. SEQUENTIAL ORTHOGONAL MATCHING PURSUIT

The results summarized in the previous section suggest a large performance gap between ML detection and practical algorithms such as thresholding, lasso, and OMP, especially when the SNR is high. Specifically, as the SNR increases, the performance of these practical methods saturates at a scaling in the number of measurements that can be significantly higher than that for ML.

In this section, we introduce an OMP-like algorithm, which we call sequential orthogonal matching pursuit, that under favorable conditions can break this barrier. Specifically, in some cases, the performance of SequOMP does not saturate at high SNR.

A. Algorithm: SequOMP

Given a received vector y, a threshold level μ > 0, and a detection order σ (a permutation on {1, …, n}), the algorithm produces an estimate Ŝ of the support with the following steps:

1) Initialize the counter j = 1 and set the initial support estimate to empty: Ŝ(0) = ∅.

2) Compute z(j) = P(j) y, where P(j) is the projection operator onto the orthogonal complement of the span of { a_l : l ∈ Ŝ(j−1) }.

3) Compute the squared correlation between z(j) and a_{σ(j)}:

ρ^2(j) = |a_{σ(j)}' z(j)|^2 / ( ||a_{σ(j)}||^2 ||z(j)||^2 ).

4) If ρ^2(j) > μ, add the index σ(j) to the support estimate. That is, Ŝ(j) = Ŝ(j−1) ∪ {σ(j)}. Otherwise, set Ŝ(j) = Ŝ(j−1).

5) Increment j to j + 1. If j ≤ n, return to step 2.

6) The final estimate of the support is Ŝ = Ŝ(n).

The SequOMP algorithm can be thought of as an iterative version of thresholding with the difference that, after a nonzero component is detected, subsequent correlations are performed only in the orthogonal complement to the corresponding column of A. The method is identical to the standard OMP algorithm of [8]–[10], except that SequOMP passes through the data only once, in a fixed order. For this reason, SequOMP is computationally simpler than standard OMP. A minimal implementation sketch is given below.

As simulations will illustrate later, SequOMP generally has much worse performance than standard OMP. It is not intended as a competitive practical alternative. Our interest in the algorithm lies in the fact that we can prove positive results for SequOMP. Specifically, we will be able to show that this simple algorithm, when used in conjunction with known conditional ranks, can achieve a fundamentally better scaling at high SNRs than what has been proven to be achievable with methods such as lasso and OMP.
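A direct NumPy rendering of the six steps above follows. One detail is an assumption on our part: the residual is correlated with the projected column, which matches step 3 up to the normalization of the correlation; an incrementally grown orthonormal basis stands in for the projection operators P(j).

```python
import numpy as np

def sequomp(y, A, mu, order=None):
    """Sequential OMP sketch: one pass over the columns of A in the given
    detection order; a column is accepted when its squared correlation with
    the current residual exceeds mu, and accepted columns are projected out."""
    m, n = A.shape
    if order is None:
        order = range(n)              # identity permutation, as in Section IV-B
    support = []
    Q = np.zeros((m, 0))              # orthonormal basis for accepted columns
    for j in order:
        z = y - Q @ (Q.T @ y)         # residual: projection onto orthogonal complement
        aj = A[:, j] - Q @ (Q.T @ A[:, j])
        rho2 = (aj @ z) ** 2 / (np.sum(aj**2) * np.sum(z**2))
        if rho2 > mu:                 # step 4: threshold test
            support.append(j)
            Q = np.column_stack([Q, aj / np.linalg.norm(aj)])
    return set(support)
```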

B. Sequential OMP Performance

The analyses in Section III hold for deterministic vectors x. Recall the partially-random signal model (2), in which b_j is a Bernoulli random variable while the value of x_j conditional on x_j being nonzero remains deterministic; i.e., u is deterministic.

Let p_j denote the conditional energy of x_j, conditioned on b_j = 1 (i.e., j ∈ S). Then

p_j = u_j^2,  j = 1, …, n.    (20)

We will call { p_j } the power profile. Since E[ |x_j|^2 ] = λ p_j for every j, the average value of SNR in (4) is given by

E[SNR] = λ Σ_{j=1}^{n} p_j.    (21)

Also, in analogy with MAR and SNR_min in (6) and (7), define the corresponding quantities with |x_j|^2 replaced by the conditional power p_j. Note that the power profile and the quantities MAR and SNR_min as defined above are deterministic.

To simplify the notation, we henceforth assume σ is the identity permutation, i.e., the detection order in SequOMP is simply 1, 2, …, n. A key parameter in analyzing the performance of


SequOMP is what we will call the minimum signal-to-interference-and-noise ratio (MSINR)

γ = min_{1≤j≤n} p_j / (1 + I_j),    (22a)

where I_j is given by

I_j = λ Σ_{l=j+1}^{n} p_l.    (22b)

The parameters γ and I_j have simple interpretations: Suppose SequOMP has correctly detected x_l for all l < j. Then, in detecting x_j, the algorithm sees the noise d with power 1 plus, for each component l > j, an interference power p_l with probability λ. Hence, I_j is the total average interference power seen when detecting x_j, assuming perfect cancellation up to that point. Since the conditional power of x_j is p_j, the ratio in (22a) represents the average SINR seen while detecting component j. The value γ is the minimum SINR over all components.

Theorem 1: Let λ = λ(n), m = m(n), and the power profile

{ p_j } = { p_j(n) } be deterministic quantities varying with n that satisfy

(23a)

(23b)

Also, assume the sequence of power profiles satisfies the limit

(23c)

Finally, assume that, for all n,

m ≥ (1 + δ) (2 L(n, λn) / γ) log((1 − λ)n)    (24)

for some δ > 0, where L(·,·) is defined in (14) and γ is defined in (22a). Then, there exists a sequence of thresholds, μ = μ(n), such that SequOMP with the detection order 1, 2, …, n will achieve asymptotic reliable detection. The sequence of threshold levels can be selected independent of the sequence of power profiles.

Proof: See Appendix A.

The theorem provides a simple sufficient condition on the number of measurements m as a function of the MSINR γ, probability λ, and dimension n. The condition (23c) is somewhat technical; we will verify its validity in examples. The remainder of this section discusses some of the implications of this theorem.

C. Most Favorable Detection Order With Known Conditional Ranks

Suppose that the ordering of the conditional power levels p_1, …, p_n is known at the detector, but possibly not the values themselves. Reordering the power profile is equivalent to changing the detection order, so we seek the most favorable ordering of the power profile. Since I_j defined in (22b) involves the sum of the tail of the power profile, the MSINR defined in (22a) is maximized when the power profile is non-increasing:

p_1 ≥ p_2 ≥ ⋯ ≥ p_n.    (25)

In other words, the best detection order for SequOMP is from strongest component to weakest component.

Using (25), it can be verified that the MSINR is bounded below by

γ ≥ SNR / ( λn (1 + SNR) ).    (26)

Furthermore, in cases of interest λn → ∞, so the sufficiency of the scaling (24) shows that

m ≥ (1 + δ) 2L(n, λn) λn ((1 + SNR)/SNR) log((1 − λ)n)    (27)

is sufficient for asymptotic reliable detection. This expression is shown in Table I with the additional simplification that the infinitesimal δ is omitted. To keep the notation consistent with the expressions for the other entries in the table, we have used k for λn, which is the average number of nonzero entries of x.

When SNR → ∞, (27) simplifies to

m ≥ (1 + δ) 2L(n, λn) λn log((1 − λ)n).    (28)

This is identical to the lasso and OMP performance except for the factor L(n, λn), which lies in [1, 4] for λ ≤ 1/2. In particular, the minimum number of measurements does not depend on MAR; therefore, similar to lasso and OMP, SequOMP can theoretically detect components that are much below the average power at high SNRs. More generally, we can say that knowledge of the conditional ranks of the powers enables a very simple algorithm to achieve resistance to large dynamic ranges.

D. Optimal Power Shaping

The MSINR lower bound in (26) is achieved asymptotically when the power profile is constant (all p_j's are equal). Thus, opposite to thresholding, a constant power profile is in some sense the worst power profile for a given total SNR for the SequOMP algorithm.

This raises the question: what is the most favorable power profile? Any power profile maximizing the MSINR γ subject to a constraint on the total SNR (21) will achieve the minimum in (22a) for every j and thus satisfy

p_j = γ (1 + I_j),  j = 1, …, n.    (29)

The solution to (29) and (21) is given by

p_j = γ (1 + γλ)^{n−j},    (30a)

where

γ = ( (1 + SNR)^{1/n} − 1 ) / λ ≈ log(1 + SNR)/(λn),    (30b)

and the approximation holds for large n.²

The power profile (30a) is exponentially decreasing in the index j. Thus, components early in the detection sequence are allocated exponentially higher power than components later in the sequence. This allocation ensures that early components have sufficient power to overcome the interference from all the components later in the detection sequence that are not yet cancelled. Again, some algebra shows that when λ is bounded away from zero, the power profile in (30) satisfies the technical condition (23c).

Substituting (30b) into (24), we see that the scaling

m ≥ (1 + δ) 2L(n, λn) ( λn / log(1 + SNR) ) log((1 − λ)n)    (31)

is sufficient for SequOMP to achieve asymptotic reliable detection with the best-case power profile. This expression is shown in Table I, again with the infinitesimal δ omitted.
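The closed form (30) can be evaluated directly; the sketch below (our illustration, with arbitrary dimensions) constructs the profile and numerically checks the total-SNR constraint (21):

```python
import numpy as np

def optimal_profile(n, lam, snr):
    """Best-case power profile (30): constant-SINR allocation with
    gamma = ((1+SNR)^(1/n) - 1)/lam and p_j = gamma*(1 + gamma*lam)^(n-j)."""
    gamma = ((1.0 + snr) ** (1.0 / n) - 1.0) / lam
    j = np.arange(1, n + 1)
    p = gamma * (1.0 + gamma * lam) ** (n - j)   # exponentially decreasing in j
    return p, gamma

snr = 10 ** (25 / 10)                            # 25 dB, illustrative
p, gamma = optimal_profile(n=500, lam=0.02, snr=snr)
assert np.isclose(lam * p.sum(), snr)            # total SNR constraint (21)
```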

E. SNR Saturation

As discussed earlier, a major problem with thresholding, lasso, and OMP is that their performance "saturates" with high SNR. That is, even as the SNR scales to infinity, the minimum number of measurements scales as m = Θ(k log(n − k)). In contrast, optimal ML detection can achieve a scaling m = O(k) when the SNR is sufficiently high.

A consequence of (31) is that SequOMP with exponential power shaping can overcome this barrier. Specifically, if we take the scaling of m in (31), apply the bound L(n, λn) ≤ 4 for λ ≤ 1/2, and assume that λ is bounded away from zero, we see that asymptotically SequOMP requires only m = O( λn log n / log(1 + SNR) ) measurements. In this way, unlike thresholding and lasso, SequOMP is able to succeed with scaling m = O(k) when log(1 + SNR) = Ω(log n). In fact, if SNR grows slightly faster, so that it satisfies log n / log(1 + SNR) → 0 while the technical conditions of Theorem 1 still hold, then (31) leads to an asymptotic sufficient condition of m = k(1 + o(1)).

F. Power Shaping With Sparse Bayesian Learning

The fact that power shaping can provide benefits when combined with certain iterative detection algorithms confirms the observations in the work of Wipf and Rao [41]. That work considers signal detection with a certain sparse Bayesian learning (SBL) algorithm. They show the following result: Suppose x has k nonzero components and p_i, i = 1, …, k, is the power of the ith largest component. Then, for a given measurement matrix A, there exist constants ν_1, …, ν_{k−1} such that if

p_i ≥ ν_i p_{i+1},  i = 1, …, k − 1,    (32)

the SBL algorithm will correctly detect the sparsity pattern of x.

²The solution (30) is the α = 0 case of a more general result in Section IV-G; see (35).

The condition (32) shows that a certain growth in the powers can guarantee correct detection. The parameters ν_i, however, depend in some complex manner on the matrix A, so the appropriate growth is difficult to compute. Wipf and Rao also provide strong empirical evidence that shaping the power with certain profiles can greatly reduce the number of measurements needed.

The results in this paper add to Wipf and Rao's observations by showing that growth in the powers can also assist SequOMP. Moreover, for SequOMP, we can explicitly derive the optimal power profile for certain large random matrices.

This is not to say that SequOMP is better than SBL. In fact, empirical results in [39] suggest that SBL will outperform OMP, which will in turn do better than SequOMP. As we have stressed before, the point of analyzing SequOMP here is that we can derive concrete analytic results. These results may provide guidance for more sophisticated algorithms.

G. Robust Power Shaping

The above analysis shows certain benefits of SequOMP used in conjunction with power shaping. The results are proven for reliable detection of all entries of the support in a limit of unbounded block length (see Definition 1). In problems of finite size, or at operating points where a nonzero fraction of errors is tolerable, the power shaping above may hurt performance.

When a nonzero component is not detected by SequOMP, that component's energy is not cancelled out and remains as interference for all subsequent components in the detection sequence. With power shaping, components early in the detection sequence have much higher power than components later in the sequence. Compared to a case with the same total SNR and a constant power profile, the use of power shaping reduces the probability of an early missed detection but increases the harm in subsequent steps that comes from such a missed detection. As the block length increases, the probability of missed detection can be driven to zero. But at any finite block length, the probability of a missed detection early in the sequence will always be nonzero.

The work [42] observed a similar problem when successive interference cancellation is used in a CDMA uplink. To mitigate the problem, [42] proposed to adjust the power allocations to make them more robust to detection errors early in the detection sequence. The same technique, which we will call robust power shaping, can be applied to SequOMP as follows.

The condition (29) is motivated by maintaining a constant MSINR through the detection process, assuming all components with indexes l < j have been correctly detected and subtracted. An alternative, following [42], is to assume that some fixed fraction α of the energy of components early in the detection sequence is not cancelled out due to missed detections. We will call α the leakage fraction. With nonzero leakage, the condition (29) is replaced by

p_j = γ ( 1 + λ Σ_{l>j} p_l + α λ Σ_{l<j} p_l ).    (33)

For given γ, λ, and α, (33) is a system of linear equations that determines the power profile { p_j }; one can vary γ until the power profile provides the desired SNR according to (21).


Fig. 1. Probability of error in full support recovery with SequOMP for a signal with n components when the power profile is optimized as in (35) with a fixed leakage fraction α. Each shaded box presents the result of 2000 Monte Carlo trials with λ, m, and SNR as indicated. The white line shows the theoretical sufficient condition on m obtained from Theorem 1.

A closed-form solution to (33) provides some additional insight. Adding and subtracting α λ Σ_{l≥j} p_l inside the parentheses in (33) while also using (21) yields

p_j = γ ( 1 + α·SNR − αλ p_j + (1 − α) λ Σ_{l>j} p_l ),

which can be rearranged to

p_j (1 + αγλ) = γ (1 + α·SNR) + γ (1 − α) λ Σ_{l>j} p_l.    (34)

Using standard techniques for solving linear constant-coefficient difference equations,

p_j = p_n (1 + η)^{n−j},    (35a)

where

η = (1 − α) γλ / (1 + αγλ),  p_n = γ (1 + α·SNR) / (1 + αγλ),    (35b)

and γ is determined implicitly by the total SNR constraint (21),

λ p_n ( (1 + η)^n − 1 ) / η = SNR.    (35c)

Notice that α < 1 implies η > 0, so the power profile (35a) is decreasing as in the case without leakage in Section IV-D. Setting α = 0 recovers (30).

V. NUMERICAL SIMULATIONS

A. Basic Validation of Sufficient Condition

We first compare the actual performance of the SequOMP algorithm with the sufficient condition for support recovery in Theorem 1. Fig. 1 shows the simulated probability of error p_err obtained using SequOMP at various SNR levels, probabilities λ of nonzero components, and numbers of measurements m. In all these simulations, the number of components n was fixed, and each shaded box represents an empirical probability of error over 2000 independent Monte Carlo trials. The robust power profile of Section IV-G is used with a fixed leakage fraction α. Here and in subsequent simulations, the threshold μ is set to the level specified in (39) in the proof of Theorem 1 in Appendix A.³

The white line in Fig. 1 represents the number of measurements m for which Theorem 1 would theoretically guarantee reliable detection of the support at infinite block lengths. To apply the theorem, we used the MSINR γ from (35c). At the block lengths considered in these simulations, the probability of error at the theoretical sufficient condition is small, typically under 0.1%. The theoretical sufficient condition shows the same trends as the empirical results.
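The experiments above are easy to reproduce in outline. The driver below is our sketch, reusing the sequomp and robust_profile functions sketched earlier; the parameter values passed in are placeholders, not the paper's:

```python
import numpy as np

def error_rate(n, m, lam, p, mu, trials=2000, seed=0):
    """Monte Carlo estimate of the probability of full-support-recovery
    error for SequOMP (power profile p assumed sorted nonincreasing;
    identity detection order)."""
    rng = np.random.default_rng(seed)
    errors = 0
    for _ in range(trials):
        b = rng.random(n) < lam
        x = b * np.sqrt(p)                      # conditional power p_j = u_j^2
        A = rng.normal(0, np.sqrt(1/m), (m, n))
        y = A @ x + rng.normal(0, np.sqrt(1/m), m)
        errors += sequomp(y, A, mu) != set(np.flatnonzero(b))
    return errors / trials
```

Sweeping m and SNR and plotting the resulting error rates reproduces the shape of these experiments, though exact numbers depend on the parameter values, which are not specified in this transcript.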

B. Effect of Power Shaping

Fig. 2. Comparison of SequOMP and OMP, with and without power shaping. Probability of error in full support recovery is plotted as a function of m, with the number of components n and the probability λ of an entry being nonzero fixed, and total SNR of 25 dB. The power profile is either constant or optimized as in (35) with a fixed leakage fraction α.

Fig. 2 compares the performances of SequOMP and OMP, with and without power shaping. In the simulations, the total SNR is 25 dB. When power shaping is used, the power profile is determined through (35) with a fixed leakage fraction α; otherwise, the power profile is constant. The number of measurements m was varied, and for each m, the probability of error was estimated with 5000 independent Monte Carlo trials.

As expected from the theoretical analysis in this paper, with the total SNR kept constant, the performance of SequOMP is improved by optimization of the power profile. As is also to be expected, SequOMP is considerably worse than OMP, whether in terms of error probability for a given number of measurements or the number of measurements needed to achieve a given error probability. Our interest in SequOMP is that it is amenable to analysis; OMP presumably performs better than SequOMP in any setting of interest, but it does not do so for every problem instance, so our analysis does not carry over to OMP rigorously.

The simulation in Fig. 2 shows that power shaping provides gains with OMP as well. As discussed in Section IV-F, this is consistent with observations in the work of Wipf and Rao [41].

³Simulations presented in [43] use a different choice of μ. There, μ is adjusted to achieve a fixed false alarm probability, and all plotted quantities are missed detection probabilities. The conclusions are qualitatively similar.


VI. CONCLUSION

Methods such as OMP and lasso, which are widely used in sparse signal support detection problems, exhibit advantages over thresholding but still fall far short of the performance of optimal (ML) detection at high SNRs. Analysis of the SequOMP algorithm has shown that knowledge of the conditional ranks of signal components enables performance similar to that of OMP and lasso at a lower complexity. Furthermore, in the most favorable situations, conditional rank knowledge changes the fundamental scaling of performance with SNR so that performance no longer saturates with SNR.

APPENDIX
PROOF OF THEOREM 1

A. Proof Outline

At a high level, the proof of Theorem 1 is similar to the proof of [5, Thm. 2], the thresholding condition (16). One of the difficulties in the proof is handling the dependence between random events at different iterations of the SequOMP algorithm. To avoid this difficulty, we first show an equivalence between the success of SequOMP and an alternative sequence of events that is easier to analyze. After this simplification, small modifications handle the cancellations of detected vectors.

Fix j and define S_j, the set of elements of the true support with indices less than j. Observe that S_1 = ∅ and S_{n+1} = S. Let P(j) denote the projection operator onto the orthogonal complement of span{ a_l : l ∈ S_j }, and define

(36)

A simple induction argument shows that SequOMP correctly detects the support if and only if, at each iteration j, the variables z(j) and ρ^2(j) defined in the algorithm are equal to P(j)y and the quantity in (36), respectively. Therefore, SequOMP correctly detects the support if and only if the quantity in (36) exceeds the threshold μ exactly for the indices j in the true support.

To prove that p_err → 0, it suffices to show that there exists a sequence of threshold levels μ = μ(n) such that

(37a)

(37b)

hold in probability. The first limit (37a) ensures that all the components in the true support will not be missed and will be called the zero-missed-detection condition. The second limit (37b) ensures that all the components not in the true support will not be falsely detected and will be called the zero-false-alarm condition.

Set the sequence of threshold levels as follows. Since δ > 0 in (24), we can find an ε > 0 such that

(38)

For each n, let the threshold level be

(39)

The asymptotic lack of missed detections and false alarms with these thresholds is proven in Appendices D and E, respectively. In preparation for these sections, Appendix B reviews some facts concerning tail bounds on chi-squared and beta random variables, and Appendix C presents some preliminary computations.

B. Chi-Squared and Beta Random Variables

The proof requires a number of simple facts concerning chi-squared and beta random variables. These variables are reviewed in [44]. We omit all the proofs in this subsection and instead reference very closely related lemmas in [5].

A random variable U has a chi-squared distribution with r degrees of freedom if it can be written as U = Σ_{i=1}^{r} z_i^2, where the z_i are i.i.d. N(0, 1).

Lemma 1 ([5], Lemma 2): Suppose z is an r-dimensional random vector with a Gaussian N(0, I_r) distribution. Then:
a) ||z||^2 is chi-squared with r degrees of freedom; and
b) if v is any other r-dimensional random vector that is nonzero with probability one and independent of z, then the variable |v'z|^2/||v||^2 is a chi-squared random variable with one degree of freedom.

The following two lemmas provide standard tail bounds.

Lemma 2 (Similar to [5], Lemma 3): Suppose that for each n there is a set of Gaussian random vectors, each spherically symmetric in an r(n)-dimensional space. The variables may be dependent. Under suitable growth conditions on the number of vectors and on r(n), the limits of the normalized extremes of their squared norms hold in probability.

Lemma 3 ([5], Lemma 4): Suppose that for each n there is a set of chi-squared random variables, each with one degree of freedom. The variables may be dependent. Then the normalized maximum converges, where the limit is in probability.

The final two lemmas concern certain beta-distributed random variables. A real-valued scalar random variable U follows a B(r, s) distribution if it can be written as U = X_r/(X_r + X_s), where the variables X_r and X_s are independent chi-squared random variables with r and s degrees of freedom, respectively. The importance of the beta distribution is given by the following lemma.

Lemma 4 ([5, Lemma 5]): Suppose z and v are independent r-dimensional random vectors with z being spherically-symmetrically distributed in R^r and v having any distribution that is nonzero with probability one. Then the random variable U = |v'z|^2/(||v||^2 ||z||^2) is independent of ||z|| and follows a B(1, r − 1) distribution.

The following lemma provides a simple expression for the maxima of certain beta-distributed variables.


Lemma 5 ([5, Lemma 6]): For each n, suppose {U_i} is a set of random variables, each with a B(1, r(n) − 1) distribution. Under suitable growth conditions on the number of variables and on r(n), the normalized maximum of the U_i converges, in probability.

C. Preliminary Computations and Technical Lemmas

We first need to prove several simple but technical bounds. We begin by considering the dimension m(j) of the range space of P(j), defined as

(40)

Our first lemma computes the limit of this dimension.

Lemma 6: The limit

(41)

holds in probability and almost surely.

Proof: Recall that P(j) is the projection onto the orthogonal complement of the vectors a_l with l ∈ S_j. With probability one, these vectors will be linearly independent, so the range of P(j) will have dimension m − |S_j|. Since |S_j| is increasing with j,

(42)

Since each index is in the support with probability λ and the b_j's are independent, the law of large numbers shows that |S_n|/n → λ in probability and almost surely. Combining this with (42) and (23b) shows (41).

Next, for each j, define the residual vector

(43)

Observe that z(j) = P(j)y equals this residual, where (a) follows from (1) and (b) follows from the fact that P(j) is the projection onto the orthogonal complement of the span of all vectors a_l with l ∈ S_j.

The next lemma shows that the power of the residual vector is described by the random variable

s(j) = 1 + Σ_{l=j+1}^{n} b_l p_l.    (44)

Lemma 7: For all j, the residual vector, conditioned on the modulation vector b and projection P(j), is a spherically symmetric Gaussian in the range space of P(j) with total variance

s(j) m(j)/m,    (45)

where m(j) and s(j) are defined in (40) and (44), respectively.

Proof: Since the noise and the not-yet-processed signal components have Gaussian distributions, for a given modulation vector b the residual must be a zero-mean white Gaussian vector. Also, since the operator P(j) is a function only of the components and columns already processed, the residual is independent of them. Since P(j) is a projection from an m-dimensional space to an m(j)-dimensional space, the residual, conditioned on the modulation vector, must be a spherically symmetric Gaussian in the range space of P(j) with total variance satisfying (45).

Our next lemma requires the following version of the well-known Hoeffding inequality.

Lemma 8 (Hoeffding's Inequality): Suppose U is the sum U = c + Σ_i V_i, where c is a constant and the V_i are independent random variables that are almost surely bounded in intervals [a_i, b_i]. Then, for all ε > 0,

Pr( |U − E[U]| ≥ ε ) ≤ 2 exp( −2ε^2 / Σ_i (b_i − a_i)^2 ).

Proof: See [45].

Lemma 9: Under the assumptions of Theorem 1, the limit

max_{1≤j≤n} | s(j)/(1 + I_j) − 1 | → 0

holds in probability.

Proof: Fix j. From the definition of s(j) in (44), we can write

s(j) = 1 + Σ_{l>j} V_l,

where V_l = b_l p_l for l > j. Now recall that in the problem formulation, each x_l is nonzero with probability λ, with conditional power p_l. Also, the activity variables b_l are independent, and the conditional powers p_l are deterministic quantities. Therefore, the variables V_l are independent with

E[V_l] = λ p_l

for l > j. Combining this with the definition of I_j in (22b), we see that

E[s(j)] = 1 + I_j.

Also, for each l, we have the bound 0 ≤ V_l ≤ p_l. So, for use in Hoeffding's inequality (Lemma 8), define

B_j = Σ_{l>j} p_l^2,

where the dependence of the power profile and of B_j on n is implicit. Each V_l is bounded in an interval of length p_l, so Hoeffding's inequality (Lemma 8) shows that, for all ε > 0 and all j,

Pr( |s(j) − (1 + I_j)| ≥ ε(1 + I_j) ) ≤ 2 exp( −2 ε^2 (1 + I_j)^2 / B_j ).

Using the union bound,

Pr( max_j |s(j)/(1 + I_j) − 1| ≥ ε ) ≤ 2n max_j exp( −2 ε^2 (1 + I_j)^2 / B_j ) → 0.

The final step is due to the fact that the technical condition (23c) in the theorem implies that min_j (1 + I_j)^2 / B_j grows faster than log n. This proves the lemma.

D. Missed Detection Probability

Consider any j ∈ S. Using (43) to rewrite (36), along with some algebra, shows

(46)

where

(47)

(48)

Define the minimum of the quantities (47) over j ∈ S and the maximum of the quantities (48) over j ∈ S. We will now bound the former from below and the latter from above.

We first start with (47). Conditional on b and P(j), Lemma 7 shows that each residual is a spherically-symmetrically distributed Gaussian on the m(j)-dimensional range space of P(j). Since there are asymptotically λn elements in S, Lemma 2 along with (23b) shows that

(49)

where the limit is in probability. Similarly, the noise term is also a spherically-symmetrically distributed Gaussian in the range space of P(j). Since P(j) is a projection from an m-dimensional space to an m(j)-dimensional space, Lemma 2 along with (23b) shows that

(50)

Taking limits (in probability),

(51)

where (a) follows from (47); (b) follows from (49) and (50); (c) follows from (20); (d) follows from Lemma 9; and (e) follows from (22a).

We next consider (48). Conditional on P(j), the relevant vectors are independent spherically-symmetric Gaussians in the range space of P(j). It follows from Lemma 4 that each quantity in (48) is a beta-distributed random variable. Since there are asymptotically λn elements in S, Lemma 5 along with (41) and (23b) shows that

(52)

The above analysis shows that, for any j ∈ S,

(53)

where (a) follows from the definitions above; (b) follows from (51) and (52); (c) follows from (24); (d) follows from (14); (e) follows from (39); and (f) follows from (38). Therefore, starting with (46), the minimum correlation over the true support asymptotically exceeds the threshold, where (a) follows from (46); (b) follows from (53); (c) follows from the fact that each quantity in (48) is at most one (it is a beta-distributed random variable); and (d) follows from (51). This proves the first requirement, condition (37a).

E. False Alarm Probability

Now consider any index j ∉ S. This implies that x_j = 0, and therefore (43) shows that the residual contains no contribution from the jth component. Hence, from (36),

(54)

where the right-hand quantity is defined in (48). From the discussion above, each such quantity has a beta distribution. Since there are asymptotically (1 − λ)n elements not in S, the conditions (41) and (23b) along with Lemma 5 show that the limit

(55)

holds in probability. Therefore, the maximum correlation outside the true support asymptotically falls below the threshold, where (a) follows from (54); (b) follows from (39); and (c) follows from (55). This proves (37b) and thus completes the proof of the theorem.

ACKNOWLEDGMENT

The authors thank Martin Vetterli for his support, wisdom, and encouragement. The authors also thank Gerhard Kramer for helpful comments on an early draft, and the anonymous reviewers and Associate Editor for several useful suggestions.

REFERENCES

[1] A. K. Fletcher, S. Rangan, and V. K Goyal, "A sparsity detection framework for on-off random access channels," in Proc. IEEE Int. Symp. Inf. Theory, Seoul, Korea, Jun.–Jul. 2009, pp. 169–173.

[2] S. Mallat, A Wavelet Tour of Signal Processing, 2nd ed. New York: Academic, 1999.

[3] M. J. Wainwright, "Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting," IEEE Trans. Inf. Theory, vol. 55, no. 12, pp. 5728–5741, Dec. 2009.

[4] M. J. Wainwright, "Sharp thresholds for high-dimensional and noisy sparsity recovery using l1-constrained quadratic programming (Lasso)," IEEE Trans. Inf. Theory, vol. 55, no. 5, pp. 2183–2202, May 2009.

[5] A. K. Fletcher, S. Rangan, and V. K Goyal, "Necessary and sufficient conditions for sparsity pattern recovery," IEEE Trans. Inf. Theory, vol. 55, no. 12, pp. 5758–5772, Dec. 2009.

[6] B. K. Natarajan, "Sparse approximate solutions to linear systems," SIAM J. Comput., vol. 24, no. 2, pp. 227–234, Apr. 1995.

[7] S. G. Mallat and Z. Zhang, "Matching pursuits with time-frequency dictionaries," IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3397–3415, Dec. 1993.

[8] S. Chen, S. A. Billings, and W. Luo, "Orthogonal least squares methods and their application to non-linear system identification," Int. J. Control, vol. 50, no. 5, pp. 1873–1896, Nov. 1989.

[9] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, "Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition," in Proc. Conf. Rec. 27th Asilomar Conf. Signals, Syst., Comput., Pacific Grove, CA, Nov. 1993, vol. 1, pp. 40–44.

[10] G. Davis, S. Mallat, and Z. Zhang, "Adaptive time-frequency decomposition," Opt. Eng., vol. 33, no. 7, pp. 2183–2191, Jul. 1994.

[11] D. Needell and J. A. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples," Appl. Comput. Harmon. Anal., vol. 26, no. 3, pp. 301–321, May 2009.

[12] W. Dai and O. Milenkovic, "Subspace pursuit for compressive sensing signal reconstruction," IEEE Trans. Inf. Theory, vol. 55, no. 5, pp. 2230–2249, May 2009.

[13] S. S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM J. Sci. Comput., vol. 20, no. 1, pp. 33–61, 1999.

[14] R. Tibshirani, "Regression shrinkage and selection via the lasso," J. Royal Stat. Soc., Ser. B, vol. 58, no. 1, pp. 267–288, 1996.

[15] E. J. Candès and T. Tao, "The Dantzig selector: Statistical estimation when p is much larger than n," Ann. Stat., vol. 35, no. 6, pp. 2313–2351, Dec. 2007.

[16] D. L. Donoho, M. Elad, and V. N. Temlyakov, "Stable recovery of sparse overcomplete representations in the presence of noise," IEEE Trans. Inf. Theory, vol. 52, no. 1, pp. 6–18, Jan. 2006.

[17] J. A. Tropp, "Greed is good: Algorithmic results for sparse approximation," IEEE Trans. Inf. Theory, vol. 50, no. 10, pp. 2231–2242, Oct. 2004.

[18] J. A. Tropp, "Just relax: Convex programming methods for identifying sparse signals in noise," IEEE Trans. Inf. Theory, vol. 52, no. 3, pp. 1030–1051, Mar. 2006.

[19] E. J. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006.

[20] D. L. Donoho, "Compressed sensing," IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.

[21] E. J. Candès and T. Tao, "Near-optimal signal recovery from random projections: Universal encoding strategies?," IEEE Trans. Inf. Theory, vol. 52, no. 12, pp. 5406–5425, Dec. 2006.

[22] S. Rangan, A. Fletcher, and V. K Goyal, "Asymptotic analysis of MAP estimation via the replica method and applications to compressed sensing," IEEE Trans. Inf. Theory, vol. 58, no. 3, pp. 1902–1923, Mar. 2012.

[23] D. L. Donoho and J. Tanner, "Counting faces of randomly-projected polytopes when the projection radically lowers dimension," J. Amer. Math. Soc., vol. 22, no. 1, pp. 1–53, Jan. 2009.

[24] S. Sarvotham, D. Baron, and R. G. Baraniuk, "Measurements vs. bits: Compressed sensing meets information theory," in Proc. 44th Ann. Allerton Conf. Commun., Contr., Comp., Monticello, IL, Sep. 2006.

[25] A. K. Fletcher, S. Rangan, and V. K Goyal, "Rate-distortion bounds for sparse approximation," in Proc. IEEE Statist. Signal Process. Workshop, Madison, WI, Aug. 2007, pp. 254–258.

[26] G. Reeves, "Sparse signal sampling using noisy linear projections," Univ. of Calif., Dep. Elec. Eng. and Comp. Sci., Berkeley, Tech. Rep. UCB/EECS-2008-3, Jan. 2008.

[27] M. Akçakaya and V. Tarokh, "Shannon-theoretic limits on noisy compressive sampling," IEEE Trans. Inf. Theory, vol. 56, no. 1, pp. 492–504, Jan. 2010.

[28] J. A. Tropp and A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, Dec. 2007.

Page 13: IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 11 ... · IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 11, NOVEMBER 2012 5919 Ranked Sparse Signal Support Detection Alyson

FLETCHER et al.: RANKED SPARSE SIGNAL SUPPORT DETECTION 5931

[29] A. K. Fletcher and S. Rangan, “Orthogonal matching pursuit: ABrownian motion analysis,” IEEE Trans. Signal Process., vol. 60, no.3, pp. 1010–1021, Mar. 2012.

[30] H. Rauhut, K. Schnass, and P. Vandergheynst, “Compressed sensingand redundant dictionaries,” IEEE Trans. Inf. Theory, vol. 54, no. 5,pp. 2210–2219, May 2008.

[31] R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde, “Model-basedcompressed sensing,” IEEE Trans. Inf. Theory, vol. 56, no. 4, pp.1982–2001, Apr. 2010.

[32] N. Vaswani and W. Lu, “Modified-CS: Modifying compressivesensing for problems with partially known support,” IEEE Trans.Signal Process., vol. 58, no. 9, pp. 4595–4607, Sep. 2010.

[33] M. A. Khajehnejad, W. Xu, A. S. Avestimehr, and B. Hassibi, “An-alyzing weighted minimization for sparse recovery with nonuni-form sparse models,” IEEE Trans. Signal Process., vol. 59, no. 5, pp.1985–2001, May 2011.

[34] S. Rangan, A. K. Fletcher, V. K Goyal, and P. Schniter, “Hybrid ap-proximate message passing with applications to structured sparsity,”Nov. 2011 [Online]. Available: http://arxiv.org/abs/1111.2581

[35] S. Rangan, “Generalized approximate message passing for estima-tion with random linear mixing,” Oct. 2010 [Online]. Available:http://arxiv.org/abs/1010.5141

[36] W. Wang, M. J. Wainwright, and K. Ramchandran, “Information-the-oretic limits on sparse signal recovery: Dense versus sparse measure-ment matrices,” IEEE Trans. Inf. Theory, vol. 56, no. 6, pp. 2967–2979,Jun. 2010.

[37] M. F. Duarte, S. Sarvotham, D. Baron, W. B. Wakin, and R. G. Bara-niuk, “Distributed compressed sensing of jointly sparse signals,” inProc. Conf. Rec. Asilomar Conf. Signals, Syst. Comput., Pacific Grove,CA, Oct.–Nov. 2005, pp. 1537–1541.

[38] M. Tipping, “Sparse Bayesian learning and the relevance vector ma-chine,” J. Mach. Learn. Res., vol. 1, pp. 211–244, Sep. 2001.

[39] D. Wipf and B. Rao, “Sparse Bayesian learning for basis selection,”IEEE Trans. Signal Process., vol. 52, no. 8, pp. 2153–2164, Aug. 2004.

[40] P. Schniter, L. C. Potter, and J. Ziniel, “Fast Bayesian matching pursuit:Model uncertainty and parameter estimation for sparse linear models,”IEEE Trans. Signal Process., Aug. 2008, submitted for publication.

[41] D. Wipf and B. Rao, “Comparing the effects of different weight distri-butions on finding sparse representations,” presented at the Proc. Neur.Inf. Process. Syst., Vancouver, Canada, Dec. 2006.

[42] A. Agrawal, J. G. Andrews, J. M. Cioffi, and T. Meng, “Iterativepower control for imperfect successive interference cancellation,”IEEE Trans. Wireless Commun., vol. 4, no. 3, pp. 878–884, May 2005.

[43] A. K. Fletcher, S. Rangan, and V. K Goyal, “Ranked sparse signalsupport detection,” Oct. 2011 [Online]. Available: http://arxiv.org/abs/1110.6188

[44] M. Evans, N. Hastings, and J. B. Peacock, Statistical Distributions, 3rded. New York: Wiley, 2000.

[45] W. Hoeffding, “Probability inequalities for sums of bounded randomvariables,” J. Amer. Stat. Assoc., vol. 58, no. 301, pp. 13–30,Mar. 1963.

Alyson K. Fletcher (S'03–M'04) received the B.S. degree in mathematics from the University of Iowa, Iowa City. She received the M.S. degree in electrical engineering in 2002 and the M.A. degree in mathematics and the Ph.D. degree in electrical engineering, both in 2006, all from the University of California, Berkeley.

She is an Assistant Professor of Electrical Engineering in the Jack Baskin School of Engineering, University of California at Santa Cruz. Her research interests include signal processing, information theory, machine learning, and neuroscience.

Dr. Fletcher is a member of SWE, SIAM, and Sigma Xi. In 2005, she received the University of California Eugene L. Lawler Award, the Henry Luce Foundation's Clare Boothe Luce Fellowship, the Soroptimist Dissertation Fellowship, and the University of California President's Postdoctoral Fellowship.

Sundeep Rangan (M'02) received the B.A.Sc. degree from the University of Waterloo, Canada, and the M.S. and Ph.D. degrees from the University of California, Berkeley, all in electrical engineering.

He held postdoctoral appointments with the University of Michigan, Ann Arbor, and Bell Labs. In 2000, he cofounded (with four others) Flarion Technologies, a spin-off of Bell Labs, that developed Flash OFDM, one of the first cellular OFDM data systems. In 2006, Flarion was acquired by Qualcomm Technologies, where he was a Director of Engineering involved in OFDM infrastructure products. He joined the Department of Electrical and Computer Engineering, Polytechnic Institute of New York University, in 2010, where he is currently an Associate Professor. His research interests are in wireless communications, signal processing, information theory, and control theory.

Vivek K Goyal (S'92–M'98–SM'03) received the B.S. degree in mathematics and the B.S.E. degree in electrical engineering from the University of Iowa, Iowa City, where he received the John Briggs Memorial Award for the top undergraduate across all colleges. He received the M.S. and Ph.D. degrees in electrical engineering from the University of California, Berkeley, where he received the Eliahu Jury Award for outstanding achievement in systems, communications, control, or signal processing.

He was a Member of Technical Staff in the Mathematics of Communications Research Department of Bell Laboratories, Lucent Technologies, from 1998 to 2001, and a Senior Research Engineer for Digital Fountain, Inc., during 2001–2003. He has been with the Massachusetts Institute of Technology, Cambridge, since 2004. His research interests include computational imaging, sampling, quantization, and source coding theory.

Dr. Goyal is a member of Phi Beta Kappa, Tau Beta Pi, Sigma Xi, Eta Kappa Nu, and SIAM. He was awarded the 2002 IEEE Signal Processing Society Magazine Award and an NSF CAREER Award. As a research supervisor, he is coauthor of papers that won Student Best Paper awards at the IEEE Data Compression Conference in 2006 and 2011 and the IEEE Sensor Array and Multichannel Signal Processing Workshop in 2012. He served on the IEEE Signal Processing Society's Image and Multiple Dimensional Signal Processing Technical Committee from 2003 to 2009. He is a Technical Program Committee Co-Chair of IEEE ICIP 2016 and a permanent Conference Co-Chair of the SPIE Wavelets and Sparsity conference series.

