Abstract
This article describes computationally efficient approaches and associated theoretical performance guarantees for the detection of known targets and anomalies from few projection measurements of the underlying signals. The proposed approaches accommodate signals of different strengths contaminated by a colored Gaussian background, and perform detection without reconstructing the underlying signals from the observations. The theoretical performance bounds of the target detector highlight fundamental tradeoffs among the number of measurements collected, amount of background signal present, signal-to-noise ratio, and similarity among potential targets coming from a known dictionary. The anomaly detector is designed to control the number of false discoveries. The proposed approach does not depend on a known sparse representation of targets; rather, the theoretical performance bounds exploit the structure of a known dictionary of targets and the distance preservation property of the measurement matrix. Simulation experiments illustrate the practicality and effectiveness of the proposed approaches.
Keywords:
Target detection; Anomaly detection; False discovery rate; p-value; Incoherent projections; Compressive sensing
Introduction
The theory of compressive sensing (CS) has shown that it is possible to accurately
reconstruct a sparse signal from few (relative to the signal dimension) projection measurements
[1,2]. Though such a reconstruction is crucial to visually inspect the signal, there are
many instances where one is solely interested in identifying whether the underlying
signal is one of several possible signals of interest. In such situations, a complete
reconstruction is computationally expensive and does not optimize the correct performance
metric. Recently, CS ideas have been exploited in
[3–5] to perform target detection and classification from projection measurements, without
reconstructing the underlying signal of interest. In
[3,5], the authors propose nearest-neighbor-based methods to classify a signal
For example, recent advances in CS have led to the development of new spectral imaging platforms which attempt to address challenges in conventional imaging platforms related to system size, resolution, and noise by acquiring fewer compressive measurements than spatio-spectral voxels [16–21]. However, these system designs have a number of degrees of freedom which influence subsequent data analysis. For instance, the single-shot compressive spectral imager discussed in [18] collects one coded projection of each spectrum in the scene. One projection per spectrum is sufficient for reconstructing spatially homogeneous spectral images, since projections of neighboring locations can be combined to infer each spectrum. Significantly more projections are required for detecting targets of unknown strengths without the benefit of spatial homogeneity. We are interested in investigating how several such systems can be used in parallel to reliably detect spectral targets and anomalies from different coded projections.
In general, we consider a broadly applicable framework that allows us to account for background and sensor noise, and perform target detection directly from projection measurements of signals obtained at different spatial or temporal locations. The precise problem formulation is provided below.
Problem formulation
Let us assume access to a dictionary of possible targets of interest
where
• i∈{1,…,M} indexes the spatial or temporal locations at which data are collected;
• α_{i}≥0 is a measure of the signal-to-noise ratio at location i, which is either known or estimated from observations;
•
For example, in the case of spectral imaging
(1) Dictionary signal detection (DSD): Here we assume that each
(2) Anomalous signal detection (ASD): Here, our task is to detect all signals which
are not members of our dictionary, i.e., detect
Our goal is to accurately perform DSD or ASD without reconstructing the spectral input
In this article, we develop detection performance bounds which show how performance
scales with the number of detectors in a compressive setting as a function of SNR,
the similarity between potential targets in a known dictionary, and their prior probabilities.
Our bounds are based on a detection strategy which operates directly on the collected
data as opposed to first reconstructing each
Performance metric
To assess the performance of our detection strategies, we consider the false discovery rate (FDR) metric and related quantities developed for multiple hypothesis testing problems [29]. Since we collect M independent observations of potentially different signals, we are simultaneously conducting M hypothesis tests when we search for targets. Unlike the probability of false alarm, which measures the probability of falsely declaring a target for a single test, the FDR measures the expected fraction of declared targets that are false alarms; that is, it provides information about the entire set of M hypotheses instead of just one. More formally, the FDR is given by
FDR = E[V/R],
where V is the number of falsely rejected null hypotheses, and R is the total number of rejected null hypotheses. Controlling the FDR in a multiple hypothesis testing framework is akin to designing a constant false alarm rate (CFAR) detector in spectral target detection applications that keeps the false alarm rate at a desired level irrespective of the background interference and sensor noise statistics [22].
Previous investigations
Much of the classical target detection literature
[30–34] assumes that each target lies in a P-dimensional subspace of
Recently, several methods for target or anomaly detection that rely on recovering
the full spatio-spectral data from projection measurements
[36,37] have been proposed. However, they are computationally intensive and the detection
performance associated with these reconstructions is unknown. Other researchers have
exploited CS to perform target detection and classification without reconstructing
the underlying signal
[3–5]. Duarte et al.
[4] propose a matching-pursuit-based algorithm, called the incoherent detection and estimation
algorithm (IDEA), to detect the presence of a signal of interest against a strong
interfering signal from noisy projection measurements. The algorithm is shown to perform
well on experimental data sets under some strong assumptions on the sparsity of the
signal of interest and the interfering signal. Davenport et al.
[3] develop a classification algorithm called the smashed filter to classify an image
in
The authors of a more recent work [38] extend the classical RX anomaly detector [39] to directly detect anomalies from random, orthonormal projection measurements without an intermediate reconstruction step. They numerically show how the detection probability improves as a function of the signal-to-noise ratio when the number of measurements changes. Though probability of detection is a good performance measure, in many applications controlling the false discoveries below a desired level is more crucial. As a result, in our work, we propose an anomaly detection method that controls the FDR below a desired level.
Contributions
This article makes the following contributions to the above literature:
• A compressive target detection approach, which (a) is computationally efficient, (b) allows for the signal strengths of the targets to vary with spatial location, (c) allows for backgrounds mixed with potential targets, (d) considers targets with different a priori probabilities, and (e) yields theoretical guarantees on detector performance. This article unifies preliminary work by the authors [40,41], presents previously unpublished aspects of the proofs, and contains updated experimental results.
• A computationally efficient anomaly detection method that detects anomalies of different strengths from projection measurements and also controls the FDR at a desired level.
• A whitening filter approach to compressive measurements of signals with background contamination, and associated analysis leading to bounds on the amount of background to which our detection procedure is robust.
The above theoretical results, which are the main focus of this article, are supported with simulation studies in Section “Experimental results”. Classical detection methods described in [22,26,27,30–35,39,42–45] do not establish performance bounds as a function of signal resolution or target dictionary properties and rely on relatively direct observation models which we show to be suboptimal when the detector size is limited. The methods in [3,4] do not contain performance analysis, and our analysis builds upon the analysis in [5] to account for several specific aspects of the compressive target detection problem.
Whitening compressive observations
Before we present our detection methods for DSD and ASD problems, respectively, we briefly discuss a whitening step that is common to both our problems of interest.
Let us suppose that there are enough background training data available to estimate
the background mean μ_{b} and covariance matrix Σ_{b}. We can assume without loss of generality that μ_{b}=0 since Φμ_{b} can be subtracted from y. Given the knowledge of the background statistics, we can transform the background
and sensor noise model
where
and
We can now choose Φ so that the corresponding A has certain desirable properties as detailed in Sections “Dictionary signal detection” and “Anomalous signal detection”.
For a given A, the following theorem provides a construction of Φ that satisfies (3) and a bound on the maximum tolerable background contamination:
Theorem 1
Let B=I−AΣ_{b}A^{T}. If the largest eigenvalue λ_{max} of Σ_{b} satisfies
λ_{max} < 1/∥A∥^{2},
where ∥A∥ is the spectral norm of A, then B is positive definite and Φ=σB^{−1/2}A is a sensing matrix, which can be used in conjunction with a whitening filter to produce observations modeled in (2).
The proof of this theorem is provided in Appendix 1. This theorem draws an interesting relationship between the maximum background perturbation that the system can tolerate and the spectral norm of the measurement matrix, which in turn varies with K and N. Hardware designs such as those in [17,19] use spatial light modulators and digital micro mirrors, which allow the measurement matrix Φ to be adjusted easily in response to changing background statistics and other operating conditions.
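As a numerical sanity check of Theorem 1, the following Python sketch builds Φ=σB^{−1/2}A for a given A and checks that the whitening filter C_{Φ}=(ΦΣ_{b}Φ^{T} + σ^{2}I)^{−1/2}, as written in Appendix 1, maps Φ back to A. The function names, the random test matrices, and the dimensions are illustrative assumptions and are not taken from the article.

```python
import numpy as np


def inv_sqrt_spd(M):
    """Symmetric inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    if w.min() <= 0:
        raise ValueError("matrix is not positive definite")
    # V diag(1/sqrt(w)) V^T
    return (V / np.sqrt(w)) @ V.T


def design_sensing_matrix(A, Sigma_b, sigma):
    """Return Phi = sigma * B^{-1/2} A, where B = I - A Sigma_b A^T."""
    lam_max = np.linalg.eigvalsh(Sigma_b)[-1]      # largest eigenvalue
    if lam_max * np.linalg.norm(A, 2) ** 2 >= 1.0:  # spectral norm of A
        raise ValueError("need lam_max * ||A||^2 < 1 (sufficient condition)")
    B = np.eye(A.shape[0]) - A @ Sigma_b @ A.T
    return sigma * inv_sqrt_spd(B) @ A


def whitening_filter(Phi, Sigma_b, sigma):
    """Return C_Phi = (Phi Sigma_b Phi^T + sigma^2 I)^{-1/2}."""
    K = Phi.shape[0]
    return inv_sqrt_spd(Phi @ Sigma_b @ Phi.T + sigma ** 2 * np.eye(K))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, N, sigma = 5, 20, 0.3                        # hypothetical sizes
    A = rng.standard_normal((K, N)) / np.sqrt(K)
    G = rng.standard_normal((N, N))
    Sigma_b = G @ G.T
    # Scale the background so that lam_max * ||A||^2 = 0.5 < 1.
    Sigma_b *= 0.5 / (np.linalg.eigvalsh(Sigma_b)[-1] * np.linalg.norm(A, 2) ** 2)
    Phi = design_sensing_matrix(A, Sigma_b, sigma)
    C = whitening_filter(Phi, Sigma_b, sigma)
    print("C_Phi @ Phi equals A:", np.allclose(C @ Phi, A))
```

The explicit check on λ_{max}∥A∥^{2} mirrors the sufficient condition of the theorem; B can still be positive definite when the check fails, so a less conservative implementation might test the eigenvalues of B directly.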
In the sections that follow, we consider collecting measurements of the form
Dictionary signal detection
Suppose that the end user wants to test for the presence of one known target versus
the rest, but it is not known a priori which target from
where
Decision rule
We define our decision rule corresponding to target
where
The decision rule can be formally expressed in terms of the significance regions as follows:
We analyze this detector by extending the positive FDR (pFDR) error measure introduced by Storey to characterize the errors encountered in performing multiple, independent and non-identical hypothesis tests simultaneously [48]. The pFDR, discussed formally below, is the fraction of falsely rejected null hypotheses among the total number of rejected null hypotheses, subject to the positivity condition that one rejects at least one null hypothesis. The pFDR is similar to the FDR except that the positivity condition is enforced here. In our context, the positivity condition means that we declare at least one signal to be a non-target, which in turn implies that the scene of interest is composed of more than one object in the case of spectral imaging, or that the scene is not static in the case of video imaging.
Consider a collection of significance regions
where
is the number of falsely rejected null hypotheses,
is the total number of rejected null hypotheses, and
Theorem 2
Given observations of the form (2), if one performs multiple, independent, non-identical hypothesis tests of the form (5) and decides according to (7), then the worst-case pFDR, given by pFDR_{max}=max_{j∈{1,…,m}}pFDR^{(j)}(Γ), satisfies the following bound:
where
The proof of this theorem is detailed in Appendix 2. A key element of our proof is the adaptation of the techniques from [48] to non-identical independent hypothesis tests.
An achievable bound on the worst-case pFDR
Theorem 2 in the preceding section shows that, for a given A, the worst-case pFDR is bounded from above by a function of the worst-case misclassification
probability. In this section, we use this theorem to establish an achievable bound
on the worst-case pFDR that explicitly depends on the number of observations K, signal
strengths
Let us first define the quantities
Then we have the following theorem, whose proof is given in Appendix 3:
Theorem 3
Let λ_{max} denote the largest eigenvalue of Σ_{b}. For a given 0 < ε < 1−p_{max}, assume that K and N are sufficiently large so that the following conditions hold:
Then there exists a K×N sensing matrix A that satisfies the condition of Theorem 1, and for which
This result has the following implications and consequences:
(1) For a given N, the upper bound (13b) on λ_{max} increases as K increases, which implies that the system can tolerate more background perturbation if we collect more measurements.
(2) The pFDR bound (14) decays as K, d_{min}, and α_{min} increase, and increases as p_{min} decreases. For fixed p_{max}, p_{min}, α_{min}, and d_{min}, the bound in (14) enables one to choose a value of K to guarantee a desired pFDR value.
(3) The dominant part of the bound (14) is independent of N, and is only a function
of K, p_{max}, p_{min}, α_{min}, and d_{min}. The lack of dependence on N is not unexpected. Indeed, when we are interested in
preserving pairwise distances among the members of a fixed dictionary of size m, the
Johnson–Lindenstrauss lemma
[49] says that, with high probability,
(4) The bound on K given in (13c) increases logarithmically with the increase in the
difference between p_{max} and p_{min}. This is to be expected since one would need more measurements to detect a less probable
target as our decision rule weights each target by its a priori probability. If all
targets are equally likely, then p_{max}=p_{min}=1/m, and
(where the first inequality holds since K < N). In addition, the lower bound on K
also illustrates the interplay between the signal strength of the targets, the similarity
among different targets in
Inspection of the proof shows that if A is generated according to a Gaussian distribution, then the conditions of Theorem 3 will be met with high probability.
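To make the dependence on K concrete, the Monte Carlo sketch below estimates the misclassification rate of a prior-weighted nearest-neighbor (MAP) rule applied to Gaussian projections. It rests on assumptions that are not stated in the surviving text: the whitened observation is modeled as y = αAf + n with n ∼ N(0, I), the entries of A are i.i.d. N(0, 1/K), and the dictionary consists of random unit-norm signals. It illustrates the setting of Theorems 2 and 3 and is not a reproduction of the decision rule (7) or of the reported experiments.

```python
import numpy as np


def map_classify(y, A, dictionary, priors, alpha):
    """MAP label under the assumed model y = alpha * A f + n, n ~ N(0, I).

    dictionary has shape (m, N); each row is one candidate signal.
    """
    projected = alpha * dictionary @ A.T              # shape (m, K)
    resid = np.sum((projected - y) ** 2, axis=1)      # squared distances
    return int(np.argmax(np.log(priors) - 0.5 * resid))


def empirical_error(K, alpha, dictionary, priors, trials, rng):
    """Fraction of misclassified draws for one random K x N matrix A."""
    m, N = dictionary.shape
    A = rng.standard_normal((K, N)) / np.sqrt(K)      # assumed distribution
    errors = 0
    for _ in range(trials):
        truth = rng.choice(m, p=priors)
        y = alpha * A @ dictionary[truth] + rng.standard_normal(K)
        errors += int(map_classify(y, A, dictionary, priors, alpha) != truth)
    return errors / trials


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, N = 5, 200                                     # hypothetical sizes
    D = rng.standard_normal((m, N))
    D /= np.linalg.norm(D, axis=1, keepdims=True)     # unit-norm dictionary
    priors = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
    for K in (2, 5, 10, 20, 50):
        err = empirical_error(K, 2.0, D, priors, 2000, rng)
        print(f"K={K:3d}  empirical error={err:.3f}")
```

Under these assumptions the error rate should generally fall as K or α grows, which is the qualitative behavior the pFDR bound (14) describes.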
Extension to a manifold-based target detection framework
The DSD problem formulation in Section “Problem formulation” is accurate if the
signals in the dictionary are faithful representations of the target signals that
we observe. In reality, however, the target signals will differ from the dictionary
signals owing to the differences in the experimental conditions under which they are
collected. For instance, in spectral imaging applications, the observed spectrum of
any material will not match the reference spectrum of the same material observed in
a laboratory because of the differences in atmospheric and illumination conditions.
To overcome this problem, one could form a large dictionary to account for such uncertainties
in the target signals and perform target detection according to the approaches discussed
in Sections “Whitening compressive observations” and “Dictionary signal detection”.
A potential drawback with this approach is that our theoretical performance bound
increases with the size of
Let us consider a dictionary of manifolds
(1) Given {y_{i}}, form a datadependent dictionary
for ℓ∈{1,…,m} and i=1,…,M.
(2) Given
and declare that the ith observed spectrum corresponds to class j if
This two-step procedure is studied in
[3] for the case
where
Anomalous signal detection
The target detection approach discussed above assumes that the target signal of interest resides in a dictionary that is available to the user. However, in some applications (such as military applications and surveillance), one might be interested in detecting objects not in the dictionary. In other words, the target signals of interest are anomalous and are not available to the user. In this section, we show how the target detection methods discussed above can be extended to anomaly detection. In particular, we exploit the distance preservation property of the sensing matrix A to detect anomalous targets from projection measurements.
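The distance preservation property can be checked empirically. The sketch below draws A with i.i.d. N(0, 1/K) entries (an assumed distribution; the article's exact choice does not survive in this text) and measures how far the ratios ∥A(f_{i}−f_{j})∥^{2}/∥f_{i}−f_{j}∥^{2} deviate from one over all pairs in a small random collection of signals. The two-sided form of the property, the signal collection, and the dimensions are illustrative assumptions.

```python
import numpy as np
from itertools import combinations


def distortion_ratios(A, signals):
    """Ratios ||A(f_i - f_j)||^2 / ||f_i - f_j||^2 over all pairs i < j."""
    ratios = []
    for i, j in combinations(range(len(signals)), 2):
        d = signals[i] - signals[j]
        ratios.append(np.sum((A @ d) ** 2) / np.sum(d ** 2))
    return np.array(ratios)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, m = 1000, 10                                  # hypothetical sizes
    signals = rng.standard_normal((m, N))
    for K in (25, 100, 400):
        A = rng.standard_normal((K, N)) / np.sqrt(K)
        eps_hat = np.max(np.abs(distortion_ratios(A, signals) - 1.0))
        print(f"K={K:4d}  largest observed distortion={eps_hat:.3f}")
```

The largest observed distortion plays the role of ε for this particular draw of A; it typically shrinks as K increases, in line with the Johnson–Lindenstrauss lemma.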
ASD problem formulation
Given observations of the form in (2), we are interested in detecting whether
where
Note that the definition of the hypotheses given in (15a) and (15b) matches the definition in (5) for the special case where the dictionary contains just one signal. In this special case, the signal input f^{∗} is in the dictionary under the null hypothesis in both DSD and ASD problem formulations.^{b}
Anomaly detection approach
Our anomaly detection approach and the associated theoretical analysis are based on a “distance preservation” property of A, which is stated formally in (18). We propose an anomaly detection method that controls the FDR below a desired level δ for different background and sensor noise statistics. In other words, we control the expected ratio of falsely declared anomalies to the total number of signals declared to be anomalous. Note that here we work with the FDR as opposed to the pFDR, since it is possible for a scene to not contain any anomalies at all. We let V/R=0 for R=V=0 since one does not declare any signal to be anomalous in this case. In [29], Benjamini and Hochberg discuss a p-value-based procedure (the “BH procedure”) that controls the FDR of M independent hypothesis tests below a desired level. Let
be the test statistic at the ith location. The p-value can be defined in terms of our test statistic as follows:
where
To apply this procedure in our setting, we need to find a tractable expression for
the p-value at every location. This can be accomplished when A satisfies the distance-preservation condition stated below. Let
The existence of such projection operators is guaranteed by the celebrated Johnson
and Lindenstrauss (JL) lemma
[49], which says that there exist random constructions of A for which (18) holds with probability at least 1−2V^{2}e^{−Kc(ε)} provided
We now state our main theorem that gives a tight upper bound on the p-value at every
location when {α_{i}} are unknown and are estimated from the observations. Let
for i=1,…,M where ζ∈[0,1] is a measure of the accuracy of the estimation procedure.
Theorem 4
If the ith hypothesis test is defined according to (15a) and (15b), the projection
matrix A satisfies (18) for a given ε∈(0,1), and the estimates
holds for all i=1,…,M where
The proof of this theorem is given in Appendix 4. We find the p-value upper bounds
at every location and use the BH procedure to perform anomaly detection. The performance
of this procedure depends on the values of K, {α_{i}}, τ and ε. The parameter ε is a measure of the accuracy with which the projection
matrix A preserves the distances between any two vectors in
One can easily estimate {α_{i}} from {y_{i}} for some choices of A. For instance, if the entries of the projection matrix A are drawn from
In practice, we use
for some absolute constants C,c > 0. This result shows that with high probability,
The experimental results discussed in Section “Experimental results” demonstrate the performance of this detector as a function of K, {α_{i}} and τ when {α_{i}} are known and as a function of K, τ and ζ when {α_{i}} are estimated.
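For reference, the BH step-up procedure of [29] can be sketched in a few lines of Python. The function below takes any vector of p-values (or upper bounds on them, such as those from Theorem 4) and a target level δ; the helper that computes the empirical ratio V/R uses the convention V/R=0 when R=0 stated above. The function names and the example inputs are illustrative assumptions.

```python
import numpy as np


def benjamini_hochberg(p_values, delta):
    """Boolean mask of rejected null hypotheses at FDR level delta.

    Sort the p-values, find the largest k with p_(k) <= delta * k / M,
    and reject the k hypotheses with the smallest p-values.
    """
    p = np.asarray(p_values, dtype=float)
    M = p.size
    order = np.argsort(p)
    thresholds = delta * np.arange(1, M + 1) / M
    passed = np.nonzero(p[order] <= thresholds)[0]
    reject = np.zeros(M, dtype=bool)
    if passed.size > 0:
        reject[order[: passed[-1] + 1]] = True
    return reject


def false_discovery_proportion(reject, is_anomaly):
    """Empirical V/R, with V/R = 0 when nothing is rejected."""
    reject = np.asarray(reject, dtype=bool)
    is_anomaly = np.asarray(is_anomaly, dtype=bool)
    R = int(reject.sum())
    V = int(np.sum(reject & ~is_anomaly))
    return V / R if R > 0 else 0.0


if __name__ == "__main__":
    p = [0.212, 0.001, 0.060, 0.008, 0.039, 0.041, 0.042, 0.074, 0.205, 0.216]
    print(np.nonzero(benjamini_hochberg(p, delta=0.05))[0])
```

Because the procedure is step-up, a hypothesis can be rejected even if its own p-value exceeds its rank threshold, as long as a larger p-value passes a later threshold.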
Experimental results
In the experiments that follow, the entries of A are drawn from
Dictionary signal detection
To test the effectiveness of our approach, we formed a dictionary
We evaluate the performance of our detector (7) on the transformed observations, relative
to the number of measurements K, by comparing the detection results to the ground
truth. Our MAP detector returns a label
where m is the number of signals in
where
Figure 1. Compressive target detection results under the AK ({α_{i}} known) and AU ({α_{i}} unknown) cases, respectively, as a function of K. (a) Comparison of the worst-case empirical pFDR curves with the theoretical bounds when the SNR is high. (b) Comparison of the results obtained by the proposed method using projection measurements with Φ designed according to (24), with Φ chosen at random, and using downsampled measurements (DM) when the SNR is low.
In the experiment that follows, we let
For an input spectrum
where
where
Anomaly detection
In this section, we evaluate the performance of our anomaly detection method on (a) a simulated dataset, comparing the results obtained using the proposed projection measurements with those obtained using downsampled measurements, and (b) a real AVIRIS (Airborne Visible InfraRed Imaging Spectrometer) dataset.
Experiments on simulated data
We simulate a spectral image f^{∗} composed of 8100 spectra, where each of them is either drawn from a dictionary
For a fixed τ=0.1 and ε=0.1, we evaluate the performance of the detector as the number of measurements K increases under the AK and AU cases, respectively, by comparing the pseudo-ROC (receiver operating characteristic) curves obtained by plotting the empirical FDR against 1−FNR, where FNR is the false non-discovery rate. Note that 1−FNR is the expected ratio of the number of null hypotheses that are correctly rejected to the number of declared null hypotheses. The empirical FDR and FNR are computed according to
where p_{t} is the p-value threshold such that the BH procedure rejects all null hypotheses for which
p_{i}≤p_{t}, and the ground truth label
We compare the performance of our method to a generalized likelihood ratio test (GLRT)-based
procedure operating on downsampled data, where we collect measurements of the form
in (23) and
for i=1,…,M, where η is a user-specified threshold [26]. While our anomaly detection method is designed to control the FDR below a user-specified threshold, the GLRT-based method is designed to increase the probability of detection while keeping the probability of false alarm as low as possible. To facilitate a fair evaluation of these methods, we compare the pseudo-ROC curves (FDR versus 1−FNR) and the actual ROC curves (probability of false alarm p_{f} versus probability of detection p_{d}) corresponding to these methods, obtained by averaging the empirical FDR, FNR, p_{d} and p_{f} over 1,000 different noise and sensing matrix realizations for different values of K. We also compare the performance of the proposed method when Φ is chosen according to (24) and when it is chosen at random, as discussed in the previous section. Figure 2a,e show the pseudo-ROC plots and the conventional ROC plots obtained using the GLRT-based method operating on downsampled data when {α_{i}} are known. Figure 2b,f show the results obtained by using a random Gaussian Φ instead of the Φ in (24). Figure 2c,g show the pseudo-ROC plots and the conventional ROC plots obtained using our method when {α_{i}} are known. These plots show that performing anomaly detection from our designed projection measurements yields better results than performing anomaly detection on downsampled measurements and on measurements obtained using a random Gaussian Φ. This is largely due to the fact that carefully chosen projection measurements preserve distances (up to a constant factor) among pairs of vectors in a finite collection, whereas the downsampled measurements fail to preserve distances among vectors that are very similar to each other. Similarly, a random projection matrix Φ is not necessarily distance-preserving after the whitening transformation, which leads to the poor performance illustrated in Figure 2b,f.
Figure 2d,h shows the pseudo-ROC plots and the conventional ROC plots obtained using our method when {α_{i}} are unknown and are estimated from the measurements. Note that the value of ζ decreases as K increases, since the estimation accuracy of {α_{i}} improves as K increases. These plots show that the performance improves as we collect more observations, and that, as expected, the performance under the AK case is better than the performance under the AU case.
Figure 2. Comparison of the performances of the proposed anomaly detector using a random Φ, the proposed anomaly detector using the designed Φ in (24), and the GLRT-based method operating on downsampled data for different values of
K when
Experiments on real AVIRIS data
To test the performance of our anomaly detector on a real dataset, we consider the
unlabeled AVIRIS Jasper Ridge dataset
Figure 3. Anomaly detection results corresponding to real AVIRIS data for a fixed FDR control of 0.01.
We generate measurements of the form
Conclusion
This work presents computationally efficient approaches for detecting known targets and anomalies of different strengths from projection measurements without performing a complete reconstruction of the underlying signals, and offers theoretical bounds on the worst-case target detector performance. This article treats each signal as independent of its spatial or temporal neighbors. This assumption is reasonable in many contexts, especially when the spatial or temporal resolution is low relative to the spatial homogeneity of the environment or the pace with which a scene changes. However, emerging technologies in computational optical systems continue to improve the resolution of spectral imagers. In our future work we will build upon the methods that we have discussed here to exploit the spatial or temporal correlations in the data.
Appendix 1: Proof of Theorem 1
Using linear algebra and matrix theory, it is possible to show that if B=I−AΣ_{b}A^{T} is positive definite, then
Φ=σB^{−1/2}A    (24)
satisfies (3).^{c} In particular, we can substitute (24) in (3) to verify that the proposed construction of Φ satisfies (3). Observe that C_{Φ}=(ΦΣ_{b}Φ^{T} + σ^{2}I)^{−1/2} can be written in terms of (24) as follows:
where the third-to-last equation follows from the definition of B and (25) follows from the fact that B is symmetric and positive definite. If B is positive definite, then B^{−1} is positive definite as well and can be decomposed as B^{−1}=(B^{−1/2})^{T}B^{−1/2}, where the matrix square root B^{−1/2} is symmetric and positive definite. By substituting (25) and (24) in (3), we have C_{Φ}Φ=σ^{−1}B^{1/2}σB^{−1/2}A=A. A sufficient condition for B to be positive definite can be derived as follows.
To ensure positive definiteness of B, we must have
for any nonzero
since ∥A∥=∥A^{T}∥ and ∥Σ_{b}∥=λ_{max}, where λ_{max} is the largest eigenvalue of Σ_{b}. To ensure ∥AΣ_{b}A^{T}∥ < 1, ∥A∥^{2}λ_{max} has to be < 1, which leads to the result of Theorem 1.
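For readers who want the intermediate algebra, one way to carry out the verification described in this appendix is sketched below. It is a reconstruction from the definitions of B, Φ, and C_{Φ} given above, and is not necessarily the exact sequence of displayed steps in the original proof.

```latex
\begin{align*}
\Phi\Sigma_b\Phi^{T} + \sigma^{2} I
  &= \sigma^{2} B^{-1/2} A\Sigma_b A^{T} B^{-1/2} + \sigma^{2} I
     && \text{since } \Phi = \sigma B^{-1/2} A \\
  &= \sigma^{2} B^{-1/2}\,(I - B)\,B^{-1/2} + \sigma^{2} I
     && \text{since } A\Sigma_b A^{T} = I - B \\
  &= \sigma^{2}\left(B^{-1} - I\right) + \sigma^{2} I
   = \sigma^{2} B^{-1}, \\[4pt]
C_{\Phi} &= \left(\sigma^{2} B^{-1}\right)^{-1/2} = \sigma^{-1} B^{1/2},
\qquad
C_{\Phi}\Phi = \sigma^{-1} B^{1/2}\,\sigma B^{-1/2} A = A. \\[4pt]
x^{T} B x &= \|x\|^{2} - x^{T} A\Sigma_b A^{T} x
  \;\ge\; \left(1 - \|A\|^{2}\lambda_{\max}\right)\|x\|^{2} \;>\; 0
  && \text{for } x \neq 0 \text{ when } \|A\|^{2}\lambda_{\max} < 1.
\end{align*}
```

The last line uses x^{T}AΣ_{b}A^{T}x ≤ ∥AΣ_{b}A^{T}∥∥x∥^{2} ≤ ∥A∥^{2}λ_{max}∥x∥^{2}, which is the submultiplicativity of the spectral norm.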
Appendix 2: Proof of Theorem 2
The proof of Theorem 2 adapts the proof techniques from [48] to non-identical independent hypothesis tests. We begin by expanding the pFDR definition in (8) as follows:
Observe that R(Γ)=k implies that there exists some subset S_{k}={u_{1},…,u_{k}}⊆{1,…,M} of size k such that
By plugging in the definition of V({Γ_{i}}) from (9), we have
for all u_{ℓ}∈S_{k} since the tests are independent of each other given A. The posterior probability
where
The denominator term in (29) can be expanded as follows:
Observe that
Thus
Substituting (30) and (31) in (29),
By substituting (32) in (27) and (28) we have:
since
where p_{max}=max_{ℓ∈{1,…,m}}p^{(ℓ)}.
Appendix 3: Proof of Theorem 3
The proof is via a random selection technique, similar to random coding arguments common in information theory. Specifically, we will draw a K×N sensing matrix A at random from a particular distribution and then show that, for ε, N, and K satisfying the conditions of the theorem, the probability that the conclusions of the theorem will fail to hold for this randomly chosen A is strictly smaller than unity. This will imply that the conclusions of the theorem must be true for at least one (deterministic) realization of A.
We begin by specifying all the relevant random variables:
•
•
• G is a random K×N matrix with i.i.d.
We assume that
where α_{1},…,α_{M} > 0 are the given signal strengths.
We first consider the case when α_{1}=⋯=α_{M}=α. Given ε, N, and K, we define the following two error events:
where, for each i∈{1,…,M},
We will now prove that
The union bound gives
Letting
Next, we bound
Lemma 1 (Compressive classification error)
Consider the problem of classifying a signal of interest
where the probability is taken with respect to the distributions underlying f^{∗}, A, and n.
Using the above lemma, we have
Combining (36) and (37), we get (35).
Because of (13a), the right-hand side of (35) is less than 1−ε−p_{max}, which is strictly positive by hypothesis. Thus, from the fact that
and from (34), it follows that there exists at least one deterministic choice of the K×N sensing matrix A^{∗}, such that:
where, for a given choice of A, (P_{e})_{max}(A) denotes the maximum probability of error defined in Theorem 2.
Next, from (38a) and (13b) it follows that A^{∗} satisfies the conditions of Theorem 1. Finally, we use (11) to bound the worst-case
pFDR achievable with A^{∗}. First of all, we note that the function
Let us choose
Then from (13a) we have x + h≤1−ε−p_{max} < 1−p_{max}, and from (13c) we have x + h≥0. Hence, using (39) and simplifying, we obtain the bound
This proves the theorem for the case α_{1}=⋯=α_{M}=α.
To handle the case when the α_{i}’s are distinct, we simply let
and replace the definition of the error event
which follows from the following argument. First of all, we can replace the observation model with the equivalent model
where
Appendix 4: Proof of Theorem 4
We first prove this theorem assuming that {α_{i}} are known and later extend to the case where
Note that
with high probability. Thus,
since
When {α_{i}} are estimated from the observations such that
where (42) is due to the distance preservation property of A given in (18). Observe that
where the third-to-last equation is due to the triangle inequality, the second-to-last equation
comes from the assumption that
Endnotes
^{a} Note that τ cannot exceed
Competing interest
The authors declare that they have no competing interests.
Acknowledgements
This work was supported by the NSF Award No. DMS0811062, DARPA Grant No. HR00110910036, and AFRL Grant No. FA865007D1221.
References

EJ Candès, T Tao, Nearoptimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inf. Theory 52(12), 5406–5425 (2006)

D Donoho, Compressed sensing. IEEE Trans. Info. Theory 52(4), 1289–1306 (2006)

M Davenport, M Duarte, M Wakin, J Laska, D Takhar, K Kelly, R Baraniuk, The smashed filter for compressive classification and target recognition. Proceedings of SPIE, vol. 6498 ((San Jose, CA, 2007), pp), . 142–153

MF Duarte, MA Davenport, MB Wakin, RG Baraniuk, Sparse signal detection from incoherent projections. IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 3 ((Toulouse, France, 2006), pp), . 305–308

J Haupt, R Castro, R Nowak, G Fudge, A Yeh, in Compressive sampling for signal classification. Fortieth Asilomar Conference on Signals, Systems and Computers, (2006, pp), . 1430–1434

S Aeron, V Saligrama, M Zhao, Information theoretic bounds for compressed sensing. Inf. IEEE Trans. Theory 56(10), 5111–5130 (2010)

E AriasCastro, Y Eldar, Noise folding in compressed sensing. IEEE Signal Process. Lett 18, 478–481 (2011)

J Han, B Bhanu, Fusion of color and infrared video for moving human detection. Pattern Recogn 40(6), 1771–1784 (2007). Publisher Full Text

W Johnson, D Wilson, W Fink, M Humayun, G Bearman, Snapshot hyperspectral imaging in ophthalmology. J. Biomed. Optics 12(1), 014036–1–0140367 (2007). Publisher Full Text

R Lin, B Dennis, A Benz, in The Reuven Ramaty HighEnergy Solar Spectrscopic Imager (RHESSI)  Mission Description and Early Results (Kluwer Academic Publishers, Dordrecht, 2003)

M Martin, S Newman, J Aber, R Congalton, Determining forest species composition using high spectral resolution remote sensing data. Remote Sens. Envir 65(3), 249–254 (1998). Publisher Full Text

M Martin, M Wabuyele, K Chen, P Kasili, M Panjehpour, M Phan, B Overholt, G Cunningham, D Wilson, R DeNovo, T Vo-Dinh, Development of an advanced hyperspectral imaging (HSI) system with applications for cancer detection. Ann. Biomed. Eng. 34(6), 1061–1068 (2006)

J Miller, C Elvidge, B Rock, J Freemantle, An airborne perspective on vegetation phenology from the analysis of AVIRIS data sets over the Jasper Ridge biological preserve. Geoscience and Remote Sensing Symposium (IGARSS’90): Remote sensing for the nineties (College Park, MD, 20–24 May 1990), pp. 565–568

C Stellman, G Hazel, F Bucholtz, J Michalowicz, A Stocker, W Schaaf, Real-time hyperspectral detection and cuing. Opt. Eng. 39, 1928–1935 (2000)

K Zuzak, S Naik, G Alexandrakis, D Hawkins, K Behbehani, E Livingston, Intraoperative bile duct visualization using near-infrared hyperspectral video imaging. Am. J. Surg. 195(4), 491–497 (2008)

D Brady, M Gehm, Compressive imaging spectrometers using coded apertures. Proc. of SPIE, vol. 6246 (Kissimmee, Florida, 2006), pp. 62460A-1–62460A-9

RA DeVerse, RR Coifman, AC Coppi, WG Fateley, F Geshwind, RM Hammaker, S Valenti, FJ Warner, GL Davis, Application of spatial light modulators for new modalities in spectrometry and imaging. Spectral Imaging: Instrumentation, Applications, and Analysis II, vol. 4959, ed. by RM Levenson, GH Bearman, A Mahadevan-Jansen (2003), pp. 12–22

M Gehm, R John, D Brady, R Willett, T Schulz, Single-shot compressive spectral imaging with a dual-disperser architecture. Opt. Express 15(21), 14013–14027 (2007)

D Takhar, J Laska, MB Wakin, MF Duarte, D Baron, S Sarvotham, K Kelly, RG Baraniuk, A new compressive imaging camera architecture using optical-domain compression. Proc. IS&T/SPIE Symposium on Electronic Imaging (San Jose, CA, 2006), pp. 43–52

A Wagadarikar, R John, R Willett, D Brady, Single disperser design for coded aperture snapshot spectral imaging. Appl. Opt. 47(10), B44–B51 (2008)

F Woolfe, M Maggioni, G Davis, F Warner, R Coifman, S Zucker, Hyperspectral microscopic discrimination between normal and cancerous colon biopsies. Manuscript (2006)

D Manolakis, D Marden, G Shaw, Hyperspectral image processing for automatic target detection applications. Lincoln Laboratory J. 14(1), 79–116 (2003)

G Wei, L Agnihotri, N Dimitrova, TV program classification based on face and text processing. 2000 IEEE International Conference on Multimedia and Expo, ICME 2000, vol. 3 (2000), pp. 1345–1348

AO Hero, Geometric entropy minimization (GEM) for anomaly detection and localization. Proc. Advances in Neural Information Processing Systems (NIPS) (MIT Press, Vancouver, Canada, 2006), pp. 585–592

I Steinwart, D Hush, C Scovel, A classification framework for anomaly detection. J. Mach. Learn. Res. 6, 211–232 (2005)

D Stein, S Beaven, L Hoff, E Winter, A Schaum, A Stocker, Anomaly detection from hyperspectral imagery. IEEE Signal Process. Mag. 19(1), 58–69 (2002)

D Manolakis, G Shaw, Detection algorithms for hyperspectral imaging applications. IEEE Signal Process. Mag. 19(1), 29–43 (2002)

JO Berger, Statistical Decision Theory and Bayesian Analysis (Springer, New York, 1985)

Y Benjamini, Y Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B (Methodological) 57(1), 289–300 (1995)

X Jin, S Paswaters, H Cline, A comparative study of target detection algorithms for hyperspectral imagery. Proceedings of SPIE, vol. 7334 (2009), p. 73341W

E Kelly, An adaptive detection algorithm. IEEE Trans. Aerosp. Electron. Syst. AES-22(2), 115–127 (1986)

S Kraut, L Scharf, L McWhorter, Adaptive subspace detectors. IEEE Trans. Signal Process. 49(1), 1–16 (2001)

L Scharf, B Friedlander, Matched subspace detectors. IEEE Trans. Signal Process. 42(8), 2146–2157 (1994)

H Kwon, N Nasrabadi, Kernel matched subspace detectors for hyperspectral target detection. IEEE Trans. Pattern Anal. Mach. Intell. 28(2), 178–194 (2006)

LL Scharf, LT McWhorter, Adaptive matched subspace detectors and adaptive coherence estimators. Conference Record of the Thirtieth Asilomar Conference on Signals, Systems and Computers (Pacific Grove, CA, 1996), pp. 1114–1117

M Parmar, S Lansel, B Wandell, Spatio-spectral reconstruction of the multispectral datacube using sparse recovery. 15th IEEE International Conference on Image Processing (San Diego, CA, 2008), pp. 473–476

R Willett, M Gehm, D Brady, Multiscale reconstruction for computational spectral imaging. Comput. Imag. V 6498, 64980L-1–64980L-15 (2007)

J Fowler, Q Du, Anomaly detection and reconstruction from random projections. IEEE Trans. Image Process. 21(1), 184–195 (2012)

I Reed, X Yu, Adaptive multiple-band CFAR detection of an optical pattern with unknown spectral distribution. IEEE Trans. Acoust. Speech Signal Process. 38(10), 1760–1770 (1990)

K Krishnamurthy, M Raginsky, R Willett, Hyperspectral target detection from incoherent projections. IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) (Dallas, TX, 2010), pp. 3550–3553

K Krishnamurthy, M Raginsky, R Willett, Hyperspectral target detection from incoherent projections: nonequiprobable targets and inhomogeneous SNR. 17th IEEE International Conference on Image Processing (ICIP) (Hong Kong, 2010), pp. 1357–1360

JW Boardman, Spectral Angle Mapping: A Rapid Measure of Spectral Similarity (AVIRIS, 1993)

Z Guo, S Osher, Template matching via L1 minimization and its application to hyperspectral data. Accepted to Inverse Problems and Imaging (IPI) (2009)

H Kwon, N Nasrabadi, Kernel RX-algorithm: a nonlinear anomaly detector for hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 43(2), 388–397 (2005)

A Szlam, Z Guo, S Osher, A split Bregman method for non-negative sparsity penalized least squares with applications to hyperspectral demixing. IEEE 17th International Conference on Image Processing (ICIP) (Hong Kong, 2010), pp. 1917–1920

C Chang, Virtual dimensionality for hyperspectral imagery. SPIE Newsroom (2009). doi:10.1117/2.1200909.1749

C Chang, Q Du, Estimation of number of spectrally distinct signal sources in hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 42(3), 608–619 (2004)

J Storey, The positive false discovery rate: a Bayesian interpretation and the q-value. Ann. Stat. 31(6), 2013–2035 (2003)

W Johnson, J Lindenstrauss, Extensions of Lipschitz maps into a Hilbert space. Contemp. Math. 26, 189–206 (1984)

G Healey, D Slater, Models and methods for automated material identification in hyperspectral imagery acquired under unknown illumination and atmospheric conditions. IEEE Trans. Geosci. Remote Sens. 37(6), 2706–2717 (1999)

D Achlioptas, Database-friendly random projections. Proc. 20th ACM Symp. Principles of Database Systems (ACM Press, New York, NY, 2001), pp. 274–281

R Baraniuk, M Davenport, R DeVore, M Wakin, A simple proof of the restricted isometry property for random matrices. Constr. Approx. 28(3), 253–263 (2008)

F Krahmer, R Ward, New and improved Johnson-Lindenstrauss embeddings via the restricted isometry property. SIAM J. Math. Anal. 43(3), 1269–1281 (2011). arXiv preprint arXiv:1009.0744 (2010)

L Wasserman, All of Statistics: A Concise Course in Statistical Inference (Springer, New York, NY, 2004)

T Tao, V Vu, On random ±1 matrices: singularity and determinant. Random Struct. Algor. 28(1), 1–23 (2006)

T Tao, Talagrand’s concentration inequality. http://terrytao.wordpress.com/2009/06/09/talagrands-concentration-inequality/. Accessed on 08/03/2012

FA Kruse, JW Boardman, AB Lefkoff, JM Young, KS Kierein-Young, TD Cocks, R Jensen, PA Cocks, HyMap: an Australian hyperspectral sensor solving global problems - results from USA HyMap data acquisitions. Proc. of the 10th Australasian Remote Sensing and Photogrammetry Conference (Adelaide, Australia, 2000), pp. 18–23

KR Davidson, SJ Szarek, Local operator theory, random matrices and Banach spaces (North-Holland, Amsterdam, 2001), pp. 317–366