Abstract
In this article, we investigate the limiting spectral distribution of the sample covariance matrix (SCM) of weighted/windowed complex data. We use recent advances in random matrix theory and describe the distribution of eigenvalues of the doubly correlated Wishart matrices. We obtain an approximation for the spectral distribution of the SCM obtained from windowed data. We also determine a condition on the coefficients of the window, under which the fragmentation of the support of noise eigenvalues can be avoided, in the noiseonly data case. For the commonly used exponential window, we derive an explicit expression for the l.s.d of the noiseonly data. In addition, we present a method to identify the support of eigenvalues in the general case of signalplusnoise. Simulations are performed to support our theoretical claims. The results of this article can be directly employed in many applications working with windowed array data such as source enumeration and subspace tracking algorithms.
Introduction
The distribution of the eigenvalues of the sample covariance matrix (SCM) of data has important impact on the performance of signal processing algorithms. Over the last decade, the properties of complex Wishart matrices are used in the analysis and design of many signal processing algorithms such as in array processing. Our knowledge about the distribution of eigenvalues, eigenvectors and determinants of complex Wishart matrices and their limiting behavior is emerging as a key tool in a number of applications, e.g., in data compression and analysis of wireless MIMO channels [1,2], array processing, source enumeration and identification [35], adaptive algorithms [6,7]. The densities of the singular values of random matrices and their asymptotic behavior (as the matrix size tends to infinity) has been employed in some applications [810]. The eigenvalues of the SCM are often used to describe many signal processing problems. For example in [8], they are used as sufficient statistics for array source enumeration.
Let X_{1}, …, X_{N} be N independent zero mean Gaussian random vectors with covariance matrix of A, i.e., , where A is a nonnegative M × M Hermitian matrix. The SCM R_{N} is defined as , where X = [X_{1},…,X_{N}] contains N snapshots of the received data. In this article, we refer to this SCM as the SCM with rectangular window (SCMR) as all data samples have equal weights, i.e., a rectangular window is used. In this case R_{N} has a Wishart distribution [11] and for more than four decades, it has been known that the joint probability density function (PDF) of its eigenvalues, can be expressed in terms of hypergeometric functions [12]. More recently, a simpler form of this joint PDF was derived in terms of the product of two determinants [13]. However, this form is applicable if the array is small and the eigenvalues of the covariance matrix of the observed data are distinct. Several articles have investigated the behavior of the eigenvalues of R_{N} when M, N → ∞ assuming [14,15]. This is a more realistic assumption than assuming M is finite and N is infinite, because in most practical applications the covariance matrix A slowly varies, hence, the effective window length could not be arbitrary long. For instance, the eigenvalue estimators that are consistent in this asymptotic regime are more robust to finite sample size than other estimators which are only guaranteed to converge for fixed M and N→∞[9]. There are many works on the distribution of eigenvalues in this asymptotic regime, such as informationplusnoise [16] and spiked models where all eigenvalues are equal excluding a small number of fixed eigenvalues (spikes) [17]. Specifically, the distribution of the largest noise eigenvalue is widely studied [18,19].
Some signal processing algorithms process a batch of data together and deal with the SCMR. In addition, the existing results in literature about the behavior of the eigenvalues mainly consider the rectangular window. However in a number of practical signal processing algorithms, the SCM is estimated by applying a window as follows
where {w_{i} ≥ 0,i = 1,…,N} is a nonnegative sequence. Hereafter, we refer to R_{N} as the SCM. The SCMR is obtained using a rectangular window, i.e., where w_{i} is nonzero and constant for i = 1,…,N. These weights allow to flexibly emphasize or deemphasize some of the observations. For example smaller weights for old data samples allows to improve the agility of the algorithms. For instance in cognitive radio, it is important to detect the activities of users and the idle channels as fast as possible, thereby reducing the detection time and improving the agility of the system [20,21]. Among all windows, the exponential window, w_{i} = w_{0}p^{i}, is commonly used. Two reasons for this popularity are (1) this window allows to develop fast recursive algorithms which are considerably less expensive in terms of computational complexity, thereby facilitate the realtime implementation of these algorithms (e.g. see [22,23]) and (2) allows to forget the old data, thereby improving the tracking ability in nonstationary environments. For instance exponentially windowed data is used in most of the existing subspace tracking algorithms[24,25]. That is because only a rankone update is required for each new data vector to update the underlying SCM, which leads to simple low cost subspace tracking algorithms.
In this article, we study the effects of windowing on the distribution of the eigenvalues of the SCM. In this case, the SCM in (1) has a doubly correlated Wishart distribution [2630]. We must note that, there are numerous research results for the case of Wishart matrices, however, the spectral properties in the doubly correlated case has not been sufficiently studied.
Manipulating the joint PDF of the eigenvalues which is a very complex function is not practical, particularly for large matrices. An alternative approach used in the literature, is to employ the following empirical spectral distribution (e.s.d.) of a square matrix
where λ_{1}, λ_{2}, … ,λ_{M} are eigenvalues of A and #{.} denotes the cardinality of a set. Note that, in this definition all eigenvalues of A are assumed to be real. Although this formulation is less explicit than the joint PDF of eigenvalues, it describes the statistical behavior of the eigenvalues. In many practical cases A is a random matrix and the e.s.d. F^{A}(x) is a random function which converges almost surly to a deterministic cumulative distribution function as the dimension of the system grows. In such cases, lim M → ∞F^{A}(x) is referred to as the limiting spectral distribution (l.s.d.) of A.
In recent years, some results have been obtained on the limiting behavior of the e.s.d. of correlated Wishart matrices. In this article, for the white noise case, we study the behavior of eigenvalues of the SCM. In particular for the exponential window, we extend the results previously demonstrated in [31] and give more details along with the proofs of the required theorems. We then consider the case of signal plus noise and present a method to determine the support of eigenvalues. The main contributions in this article are
• A method is proposed to approximate the spectral distribution of the SCM using arbitrary windows with that of an equivalent Wishart Distribution. For the especial case of white noise (noise only), this approximation is the Marchenko–Pastur (M–P) distribution, which is the known distribution for the case of a rectangular window.
• In Theorem 2, we derive an accurate and explicit equation for the l.s.d. of the SCM of noiseonly data for the exponential window. Many simulations are performed to show the accuracy of this l.s.d.
• In Theorem 3 we present a systematic method to compute the support of eigenvalues in the signal plus noise data case using an exponentially weighted window. In addition to the results, we follow up a different and novel approach in proving this theorem compared with the existing proof for the rectangular window case where the Stieltjes transform m(z) has the explicit inverse [15]. This approach can be easily utilized for other window types where the Stieltjes transform is expressed explicitly or implicitly as a function of z.
The demonstrated results provide a key step toward characterization of the distribution of eigenvalues in the general Covariance matrix of windowed data. The results of this work are useful in the design and implementation of robust algorithms using windowed snapshots. Our derivations in Theorems 2 and 3 can be directly used to design unbiased eigenvalue and eigenvector estimators. These estimators are important especially because the exponential window is used in numerous applications. They can be used as a basis to improve the performance and accuracy of many existing algorithms which are based on exponentially windowed data, in many fields such as subspace tracking, DOA estimation and source enumeration.
The remainder of this article is organized as follows: Section 2 introduces the system model and some important mathematical tools. We derive an approximation for the Stieltjes transform of l.s.d. of eigenvalues of weighted windowed array data in Section 3. Asymptotic spectrum of the eigenvalues in noiseonly data case is analyzed in Section 4. The signal plus noise case is studied in Section 5. Section 6 provides simulation results. Finally, we conclude this work and suggest future works in Section 7.
2 System model for windowed SCM
We assume that in (1) is a circularly symmetrical independent Gaussian random vector process with zero mean and covariance matrix of , i.e., . In this case, we can rewrite (1), supposing that SCM is estimated using a window of size N with positive coefficients w_{1},…,w_{N},
where U = [U_{1},…,U_{N}] is an M × N matrix contains i.i.d. zeromean unitvariance complex Gaussian entries and . The matrix R_{N} has a doubly correlated Wishart distribution. In practice, it is very complex to directly characterize the e.s.d. of R_{N} thus, we use the Stieltjes transform of this distribution and indirectly characterize the behavior of the eigenvalues. Then, in the asymptotic regime as M,N→∞ given , the inverse transform of the limit gives the l.s.d of SCM.
Definition 1
[15]Stieltjes transformm(z),of a distribution functionF^{R}(x) is defined as
The inverse Stieltjes transform formula is as follows:
Hence, in order to characterize the asymptotic distribution of the sample eigenvalues, we alternatively characterize the asymptotic behavior of the corresponding Stieltjes transform, and then use the Stieltjes inversion formula in (5) to obtain l.s.d. of SCM f^{R}(x). We use the following theorem which gives the Stieltjes transform of the correlated Wishart matrix [29] and is the basis for derivations in this article.
Theorem 1
For a finite length window with length ofN, consider the matrix defined by. Assume that all elements ofare i.i.d. random variables with zeromean, unit variance and finite E {U_{ij}^{4}}. In addition, suppose thatis a Hermitian nonnegative definite matrix,W_{N} = diag(w_{1}, …, w_{N}), , whenM,N → ∞ with . In this case, the empirical distribution, with probability 1, converges weakly to a probability distribution functionF^{R}whose Stieltjes transform m(z), for, is given by
where e(z) is the unique solution of the following equation in
Proof 1
See[29]for proof. Similar results are also demonstrated in[26], [28]with some differences in the assumptions on correlation matrices.□
We emphasize that (6) and (7) give the exact distribution in the asymptotic regime as M,N → ∞ with . Since in practice, the array dimension and/or sample size are usually finite numbers, this method gives a deterministic approximation for the actual sample eigenvalue distribution.
To show how this method works, we now consider the simplest case (where the distribution is well known) using a rectangular window and white Gaussian noise, i.e., W = I_{N × N} and A = σ^{2}I_{M × M}. In this case, we have dF^{W}(w) = δ(w − 1)dw and dF^{A}(x) = δ(x − σ^{2})dx, where δ(x) is the Dirac delta function. Thus with straightforward manipulations of (6) and (7), the Stieltjes transform is found to be the solution of
In this case, as expected the e.s.d. of the SCMR, , converges to the M–P distribution [14] as follows,
Now, let us consider an arbitrary window and white noise A = σ^{2}I_{M×M}. In this case from (6), (7) and dF^{A}(x) = δ(x − σ^{2})dx, we obtain . Thus, from (6), we obtain
3 Effective length of a window
In this section, we define the effective length of a window which allows to approximate the distribution of the eigenvalues of windowed SCM with that of a rectangular window with an equivalent length, assuming that the covariance matrix of data A satisfies the assumptions of Theorem 1. In several existing articles some intuitive equivalent length are defined simply to extend the previously existing results for the rectangular case in order to analyze the behavior of the eigenvalues in the weighted window cases [22], [23].
Consider a window w_{i}>0 of length N and denote with a converging distribution, i.e., . We assume that the sample size N is much larger than the array dimension M, i.e., c is small. It is known that m(z) is bounded for z ∈ C^{+}[15]. Thus for 0 < c ≪ 1 we have 0 < cσ^{2}m sup w < β < 1 where β is some constant number. It is easy to show that^{a}, we have . This yields
Since 0 < c ≪ 1, for I = 2 and defining and w_{e}=E{w} as the effective parameters, we can rewrite (11) as
where using E{w^{3}} < sup{w^{2}}E{w} and E{w^{2}} < sup{w}E{w} it is easy to show that the approximation error is bounded by .
Definition 2
The expression (12) represents the M–P distribution as in (8) for a rectangular window of length
with all coefficients equal tow_{e}. The average weightw_{e}is a scale parameter for the eigenvalues of covariance matrix of the received data. Although we have derived the effective length for the noise only data, our results reveal that this effective window length gives accurate results for the signal plus noise case.
For the white noise data, the l.s.d. of SCM can be approximated by the M–P distribution defined in (9) by substituting c and σ^{2}, with c_{e} and w_{e}σ^{2}, respectively. Note that the effective window length is always smaller than the number of samples N. This approximation can be intuitively interpreted as a Wishart approximation where the effect of “windowing” is approximated with a rectangular window with an effective number of samples of N_{e} and the covariance matrix of the received data is scaled to A_{e}=w_{e}A”.
Now, we compute the effective length of the triangular and the exponential windows. A triangular window is defined by for i = 1,…,N ≫ 1 and has the average weight of . Using (13), the effective length of the triangular window is
The exponential window is very popular in signal processing applications due to its simple implementation and is defined by w_{i}=w_{0}p^{i} for i=1,2,…, where p∈(0,1) and w_{0} is a normalization constant. We note that the exponential window is inherently an infinite length window. Interestingly, in Theorem 1 the window length and the array dimension jointly tend to infinity where . Here for a finite array dimension M, we first approximate the exponential window with N coefficients, which is only accurate if N is large enough such that the omitted coefficients are negligible. Asymptotically as M,N jointly tend to ∞, the results from this truncated window become accurate for describing the underlying distributions using the exponential window. In this case, with some calculations we obtain . Thus the effective length of the exponential window (for N≫1) becomes
which is not a function of N. As expected the effective length of the window increases as the forgetting factor p approaches one.
4 Spectral analysis of noiseonly data
In this section, for the windowed data case, the l.s.d. of the SCM is characterized more accurately. In practice, the array dimension and the effective window length are both finite. However, we are interested in the impact of the weights of the window f^{W}(w), on the limiting distribution of the eigenvalues as M,N→∞ employing Theorem 1. We use two approaches to model f^{W}(w), Discrete and Continuous. The former considers f^{W}(w) as a finite sum of discrete masses at the coefficients of the window. The discontinuous distribution function modeling is useful to analyze the support of eigenvalues and its connectivity. The latter approach, approximates f^{W}(w) as a continuous function allowing to derive some explicit equations for the Stieltjes transform.
Let S_{F} denote the support of the function F^{R}(x) and shows its complement. From (5), we see that S_{F} consists of points on the real axis where the imaginary part of m(z) is positive, i.e., the support is the union of some subintervals. Thus to find the support of the distribution of eigenvalues, we must determine such intervals. In [15], it is shown that limy→0^{+}m_{F}(x+iy) exists for all x≠0, and therefore we can define
The following lemma is the key to determine these intervals on real axis [15].
Lemma 1 ([32], Lemma 6.1)
For any c.d.f.F, letS_{F}denote its support andbe the complement ofS_{F}. For,m=m_{F}(x) is the only real solution ofx = z (m) which satisfies
wherez(m) is the inverse function ofm(z). Also conversely, for any realmin the domain ofz(m) ifthenx = z (m) is outside the support ofF.
This simply means that the support S_{F}, is the union of intervals on the vertical axis where z(m) is increasing for real values of m. According to (10), for noise only data z(m) can be written as follows
4.1 Discrete distribution function approach
Suppose the window consists of N_{d} distinct weights w_{i},i=1,…,N_{d}, each with multiplicity . Therefore as (M,N→∞ and ), we can evaluate (18) in terms of the weights
Figure 1, represent a typical case of the function on the righthand side of (19). Lemma 1 states that the support of the distribution of eigenvalues is the complement of the set of all values x ∈ R^{+} for which x = z(m) is increasing for real values of m, i.e., . The function z(m) has poles at m=0, . In addition, z(m) is an analytic function and we have
Figure 1. A typical representation of the function z(m) in (19) versusfor c < 1 in the case where the support of eigenvalues is a connected interval.
For c < 1, i.e., where the length of the window is more than the array dimension, , thus as Figure 1 shows, has at least two solutions which we denote them and m_{l}∈(0,∞). For c > 1, as Figure 2 shows typically, from lim_{m}→±∞z(m) = 0^{∓} we conclude that m_{l} must be in . We must note that, for c > 1, the SCM has M − N zero eigenvalues expressed with a probability mass of in the l.s.d. of SCM, that is not counted as a cluster in these derivations, i.e. for c >, the PDF of the distribution includes a term of . If the weights are widely separated, the support of eigenvalues may become fragmented into union of a number of disjoint intervals.
Figure 2. A typical representation of the function z(m) in (19) versusfor c > 1 in the case where the support of eigenvalues is fragmented into two disjoint intervals.
In many signal processing applications the white noise subspace is separated from the signal subspace based on the eigenvalues of the SCM. Such a fragmentation of the support of noise eigenvalues misleads the subspace based algorithms and leads to noise eigenvalues to be mistaken as signal ones.
In fact, it is desirable that the support of eigenvalues be as compact as possible. To avoid such an undesirable fragmentation, the equation should not have real solution for , i.e.,
Under this connectivity condition, the support of eigenvalues is the interval [x_{l} = z(m_{l}),x_{u}=z(m_{u})], which can be calculated, numerically. Our simulations show that this condition is satisfied for popular window types especially for N_{d} ≫ 1 used in practice. Figure 2 shows a typical case for c > 1 where has an even number of realvalued solutions (counting multiplicities) which we denote them by (in addition to m_{l},m_{u}). Each pair of these solutions determines a subinterval for the support of eigenvalues, i.e., we have , where . Reducing c or reducing the gap between weight values {w_{i}} makes the support more compact at the expense of using more temporal samples.
4.2 Continuous function approach
The goal of this approach is to find closed form expressions of Stieltjes integrals of the l.s.d. This approach could be used for any window shapes. However, we start with the triangular window and then consider the exponential window which are more popular. Here, we model the function f^{W}(w) with a continuous distribution and evaluate (18) to found the Stieltjes transform.
For a triangular window , we have where U(w) is the unit step function. In this case it is easy to show that converges to a uniform distribution as N increases, i.e.,
Substituting f^{W}(w) in (18), we get
Again, we first use Lemma 1 and determine the support of eigenvalues (by plotting z(m) for real m and finding the intervals on the vertical axis where z(m) is not increasing). Figure 3 plots the lower and upper boundaries of support of eigenvalues for a triangular window for different values of c. It can be seen that the discrete distribution (assuming N_{d}=50) and the continuous approach result in almost the same boundaries. We also observe that these boundaries are close to those obtained by the Wishart approximation assuming the effective window length in (14). In this figure using the rectangular window with same length as the triangular window, the distribution is referred to as the M–P distribution. Also we observe that the eigenvalues tend to more concentrate around their real value σ^{2}=1 as the window length increases. In addition from this figure, we conclude that the support using the triangular window is looser than than that of the rectangular window for a given value of c, because the effective length of the triangular window is less than that of a rectangular window.
Figure 3. Upper (values on the right) and Lower (values on the right) boundaries of the support of eigenvalues using the triangular window versus c .
For the exponential window, first we introduce the new parameter γ as the ratio of smallest to largest weights of the truncated exponential window. The coefficients of the window can be redefined as as a function of γ as . Therefore from , from , it is easy to show that
where ⌊.⌋ is the floor function. This increasing staircase function takes values on . To satisfy the constraints of Theorem 1 for the exponential window, we assume that the ratio of smallest to largest weights of the window, γ=p^{N} > 0, is an arbitrary small real constant. In other words, the forgetting factor of the window approaches to 1, as M,N→∞. The smaller γ, the better this truncated exponential model fits the exponential window with the forgetting factor p. From, , we conclude that where
is a continuous function, independent of window size N and satisfies the assumptions of Theorem 1. Thus, this theorem is applicable to the exponential window truncated at some large integer N.
Substituting f^{W}(w) in (18), in the asymptotic regime of Theorem 1 as γ→0, such that , z(m) satisfies
One can use the same method as in the discrete distribution function approach and identify the support of the distribution S_{F}. However, the function z(m) in (26) is simple and the following theorem gives the explicit distribution.
Theorem 2
For the exponentially weighted window, the l.s.d. of SCM,f^{R}(x), is given by
and upper and lower boundaries of the support are
respectively, whereω_{k}(x) is the branch of Lambert W function^{b}[33]withk=−1 andk=0.
Proof 2
According to the Lemma 1, boundaries of the support of eigenvalues are the real solutions ofz^{′}(m) = 0, i.e., with some simple calculations, are the solutions of
Denotingy= ln(1+c_{0}σ^{2}m)−c_{0}−1, we obtain
This equation has two real solutionsm_{− }and m_{+ }expressed using Lambert W function as
Using (26), the boundariesz(m_{−}) andz(m_{+}) are obtained as in (28a) and (28b) which determine the support of eigenvalues as the interval.
To obtain the l.s.d. of SCM, we should findm(z) with positive imaginary part for allz∈[z(m_{−}),z(m_{+})]. In (26), we denote
and obtain
Therefore, the solutions are
According to (16), (33), for the values ofzin the interval of the obtained support on the real axis, due to the properties of the complex logarithm function, the imaginary part ofvis in [−Π,Π], thus only the branches withk = 0 andk = − 1 are acceptable solutions. It is easy to see that forz ∈ [z(m_{−}),z(m_{+})], the expression on the righthand side of (34) belongs to. From (33) and properties of Lambert W function, we also deduce that Im{m} and sin(−Im{v}) have the same signs, and forthe function sin(−Im{ω_{k}(x)}) is positive fork = − 1 and is negative fork = 0. Therefore, the Stieltjes transform of the l.s.d. of SCM is obtained from (35) and (33) as
Using the inverse formula in (18), the l.s.d of SCM is
Dropping the real terms inside the brackets and applying some simplifications, we obtain (27). □
We can define a second effective window length by employing and comparing the boundaries of the support of eigenvalues in (28a) and in (9) for a rectangular window which is only in terms of σ^{2} and c. Equating the length of the supports in (9) (28a), i.e., a_{+}−a_{−} = x_{+}−x_{−}, we can find a rectangular window to match the support as same as that of the exponential window and define the length of this rectangular window as another effective length for the exponential window. In some array signal processing applications, the effective length of the exponential window has been considered to be [22], [23]. Figure 4 compares these effective lengthes in terms of the forgetting factor p and reveals that the effective window length defined in (13) gives an accurate approximation for the exponential window. We also see a large gap between the traditional approximation for the effective length in [22] and what is obtained in this article using random matrix theory.
Figure 4. Effective length of the exponential window as a function of forgetting factor p.
Remark 1
In the economic literature, other methods have been proposed to approximate the spectral density function of exponentially weighted financial covariance matrices for Portfolio Optimization[34], [35]. These methods that are used in other articles (e.g., in[36], [37]) are based on numerical calculations rather than developing some closed form expressions. Pafka et al.[34]supposed that the density of the eigenvalues is aproximated bywhereandνis the root of
In contrast to these methods for the exponential window, we derive an accurate explicit closed form expression which can be easily employed in many applications such as in signal processing and economy.
5 Spectral analysis of signal plus noise data
In this section, we consider the case of white noise plus some signal sources, i.e., where the eigenvalues of A are not equal. In the general case, let λ_{q} > ⋯ > λ_{1} >0 denote the set of q distinct eigenvalues of the covariance matrix and the multiplicity of λ_{ℓ} is denoted by k_{ℓ} (we must have ). For example suppose a real phased array communication system with q − 1 independent signals impinging on it simultaneously on the same frequency band from different directions where q < M. The smallest eigenvalue λ_{1} can be interpreted as the noise eigenvalue and other q − 1 larger eigenvalues are referred to as signal eigenvalues. In the asymptotic regime, when N,M are growing large, we assume that , where α_{ℓ},ℓ= 1, …,q are multiplicity ratios of eigenvalues. In this case the spectral distribution of the matrix A in Theorem 1 can be expressed as sum of Dirac delta functions, i.e. .
In what follows, we present an approach to determine the support of eigenvalues and also the l.s.d. of exponentially weighted SCM of signal plus noise data in the asymptotic regime. The first in determining the distribution of the eigenvalues is to determine its support on the real positive axis.
The definition of the Stieltjes transform in (4) implies that for any distribution F and real x outside the support of F, m(x) is well defined and its derivative, , is obviously real and positive. Thus, m(z) is increasing on intervals on real line outside the support of its distribution function F[15]. Therefore, the inverse function theorem proves that its inverse exists on these intervals and shall also be increasing. For the one sided correlated Wishart matrices, where the inverse of m(z) has an explicit expression, Lemma 1 shows that the converse of the above statements are also true [15], i.e. for any real m in the domain of z(m), if then x = z (m) is outside the support of the distribution. Therefore, the support of eigenvalues is a Borel subset of for which z(m) is increasing which can be determined by simply plotting the inverse function z(m) for real m. Paul and Silverstein ([29], page 2) suggested the same method for doubly correlated Wishart matrices if there exists an explicit inverse z = z(m) for the limiting Stieltjes transform m(z). Unfortunately, for nonrectangular windows, the inverse of m(z) in general has no explicit expression [29]. Fortunately, by introducing two auxiliary variables u and h in what follows for the exponential window, we found z(h) which implicitly expresses z as a function of m. Then, we prove that the same method can be extended for the exponential window case, while the main difference here is that we are able to use the implicit expressions to determine this Borel set. Although the exponential window case is studied in this article, the same approach may be used for some other window types, to determine the support of eigenvalues.
From (6) and (7) we obtain
Substituting (25) in (39), we get
Substituting dF^{A}(a) in (7) changes the integral to a summation and we obtain e(z) as
According to Theorem 1, for any , there is a unique solution e=e(z) for (41) in . In this case the Stieltjes transform m(z) is calculated from (40) as
This expression gives the implicit relation between m and z, which cannot be sorted to express m as an explicit function of z or conversely, z as a function of m. Defining the auxiliary variable/function u = c_{0} (1+zm(z)) which provides a bijective relation between e and m for all z≠0, we have
This equation reveals that the imaginary parts of u and e have the same signs. In addition since γ <1, c_{0} is real and using the properties of the complex logarithm function in (43), we deduce that u always lies in a strip of the positive complex plane where its imaginary part is less than Π, i.e., the domain of u is defined as D_{u} = {u0 < Im{u}<Π}. Equation (43) also provides a bijective relation between e and u, therefore according to Theorem 1 for any , there is a unique u∈D_{u}, satisfying
Defining the second auxiliary variable/function as
and define D_{h} as its range for all . Resorting (44), we have
Proposition 1
The auxiliary variableh, as a function ofuandz, has some interesting properties as:
(1) halways lies in the subsetfor all.
(2) forh∈D_{h}, zcan be explicitly expressed as a function ofh
(1) For any, a uniquehsatisfying (47) exists in
Proof 3
The first property can is simply implied from (46) as the imaginary part ofh and uhave the same sign. Using (45) and (46), we can easily find (47). The third property is proved as follows. The constraint in (48) is obtained from Im{u} ∈ (0,Π) and (46). According to Theorem 1, for any, there is a uniqueu∈D_{u}, satisfying (44). The unique pair (z,u) gives anhinaccording to (45). In order to prove the uniqueness ofh, suppose thath_{1}andh_{2}insatisfy (47) and (48). Thus, (46) yieldsu_{1},u_{2}∈D_{u}satisfying (44). In addition, we must haveu_{1} = u_{2}since for any, there exists a uniqueu_{1}∈D_{u}. Thus forzandu_{1} = u_{2}, (45) yields thath_{1} = h_{2}. □
Although z(h) in (47) is defined only for h∈D_{h}, it is an analytic function for all . In addition note that at the roots of .
Also using (46) and (47) we express m as a function of h as follows
for h∈D_{h}. Similar to z(h), the complex function m_{h}(h) is an analytic function for all except at the set of real values and the points where z(h) = 0.
The inverse Stieltjes transform in (5) reveals that the l.s.d. depends on the behavior of m(z) in the vicinity of the real axis, i.e. for . Proposition 1 shows that the z(h) in (47) is injective over and allows us to treat h(z) as its inverse for . To determine the range of h(z) denoted by D_{h} we can evaluate z(h) for all h in (48) and take only those values of h for which z is in . As an example Figure 5 shows this region for c_{0} = 0.2 and a covariance matrix with two distinct eigenvalues 2 and 1 with the multiplicity ratios and . The white regions in Figure 5 shows the values of h for which z(h) has negative imaginary part, and the blue parts are the values where Im {z(h)} > 0. We observe that some parts of the positive complex plane are not in the domain of z(h) as we restrict the range to .
Figure 5. The range of the function h(z) for, for the case wherec_{0} = 0.2 and the covariance matrix has 2 distinct eigenvalues 2 and 1 with the multiplicity ratiosand.
5.1 Support of eigenvalues
Theorem 3
For the exponentially weighted window defined in Theorem 2, under the assumptions of Theorem 1, the complement of support of eigenvalues, is the set of values ofx = z(h) on the vertical axis wherefor some, wherez(h) is defined in (47).
Proof 4
LetS_{F}denotes the support of the functionF^{R}(x) andshows its complement. To prove Theorem 3, first we show that for any, there exist awhere. Then, we prove the converse, i.e. iffor some, thenx=z(h) is a real number outside the support of eigenvalues.
From (5), we see thatS_{F}consists of points on the real axis where Im {m(x+iy)} tends to a positive number wheny → 0^{+}. Thus to findS_{F}, we must determine such subintervals on the real axis, or equivalently we can determineby finding the intervals on the real axis where limy → 0^{+}{m(x+iy)} is real. Consider any x_{1 }and x_{2 }such that . According to the definition of Stieltjes transform in (4), m(z) andare both real and well defined for any z ∈ (x_{1}, x_{2}). In additionis nonnegative on this interval. Thus u(z) is a real invertible function on (x_{1},x_{2}), and its inversez(u) is also real and increasing on the interval, i.e..
Lemma 2
For any given, the function h(u,z) in (45) is monotonically increasing versus.
Proof
Defining , the function h(u,z) is continuous for all and all , and for all we have □
Sinceandare positive for all, Lemma 2 implies that the signs ofandare identical. Thus ifzis an increasing function of, it is also an increasing function ofas well, and vice versa, i.e., the intervals for which z is increasing versus u is equal to the intervals for whichzis increasing versus h. This proves the direct part of the theorem.
To prove the converse part, consider that Theorem 3 implies that is real and nonnegative for some . Sincez(h) and m_{h}(h) are both real at pointh = h_{0}, it is sufficient to show that the pointh_{0 }belongs to the boundary ofD_{h}. In this case, as the functionm(h) is continuous in the complex plane (excluding few points as stated after (49)), we conclude that lim_{y → 0+} Im {m(h_{0}+iy)} = Im {m(h_{0})} = 0. To show that h_{0 }is on the boundary of D_{h}, we prove that the points in the vicinity of h_{0 }in the positive complex plane, belong to D_{h}. Let {h_{n}} be any complex sequence with positive imaginary part converging toh_{0 }as N → ∞. Sincez(h) is continuous, the sequence {z_{n}} = {z(h_{n})} exists and converges toz(h_{0}).
Lemma 3
Let z(h) be an analytic function ofh over an open set G, and h(t) ∈ Gbe a differentiable curve at t. Then if is a positive real number, we have .
Proof 5
This lemma is obtained from the Chain rule; since z(h(t)) is differentiable at t and . Thus for positive real , the argument ofand h^{ ′ }(t) are the same.□
We use Lemma 3 which implies that ifis positive and real at the pointh = h_{0 }then for any differentiable curve h(t) at t = t_{0 }where h(t_{0}) = h_{0}, i.e. the slope of the curveh(t) in the complex plane is the same as the slope of z(h(t)) ath(t) = h_{0}. Now for sufficiently large n, consider the line L_{n} = h(t) = (1 − t)h_{0} + th_{n}, 0 ≤ t ≤ 1 in which originates from h_{0}and ends at h_{n}. The transformation of L_{n}, z(L_{n}), is also a line in the positive imaginary part of complex plane with the same slope as L_{n}, as we have supposed that for the points on L_{n}. Thus the point z_{n} also lies in the positive complex plane. In the other words, for sufficiently large n, the sequence {z_{n}} lies in; hence the sequence {h_{n}} is inD_{h}. Finally, we conclude that the Stieltjes transform is defined on any such sequences and the sequence of Stieltjes transform {m(z_{n})} = {m_{h}(h_{n})} is also in for those values of n and converges to m_{h }(h_{0}) which is a real number. Thus z(h_{0}) is outside the support of eigenvalues.□
Remark 2
We must note that we use a different approach in proving Theorem 3 comparing with proof exists for the rectangular window case where the Stieltjes transformm(z) has the explicit inverse[15]. This approach is very simple and can be used in other cases where the Stieltjes transform is expressed explicitly or implicitly as a function of z.
Theorem 3 states that in order to find the support of eigenvalues, we could first find the intervals on the real line where z(h) is increasing. In a sufficiently small vicinity of these intervals on the positive imaginary part of the complex plane, it is discussed in the proof that the imaginary part of z(h) is also positive for all h in this vicinity, therefore this vicinity lies in D_{h}. Having a closer look at Figure 5, we find that approaches real axis only for some values of h which can be easily studied that these are the intervals for which z^{′}(h)>0. Thus according to this theorem the support of eigenvalues consists of three disjoint intervals for the setting of Figure 5.
Employing Theorem 3 and plotting z(h) for h < 0 one can determine the support of eigenvalues of the SCM in the asymptotic regime. The function z(h) has asymptotes at with the following onesided limits
Figure 6 shows a typical representation of the support of eigenvalues in the signal plus noise case when c_{0} = 0.1 and the covariance matrix has four distinct eigenvalues 5, 3, 2, 1 with multiplicities α_{1} = α_{2} = α_{3} = 0.1 and α_{4} = 0.7. It can be studied that in general, z(h)→ + ∞ as h→0^{−} and z(h)→0^{ + } as h→−∞ and also analogous with the rectangular window case [38] the number of extrema of z(h) (counting the multiplicities) is even and are the solutions of . Generally, in order to determine the support of eigenvalues, we identify all intervals on the vertical axis where z(h) is increasing and in general case denote them by . Removing these intervals from , what is left is S_{F} and according to the proof of Theorem 3. these intervals will not overlap each other. To see this, we note for each , there is a unique h∈D_{h}, such that x=z(h). Assume that , b ∈ {1,…,s} are the subintervals in the h domain where z(h) is increasing. Therefore, uniquely determines , which is an interval in . The complement of these intervals are the points determine the support of eigenvalues. It can be seen in Figure 6 that the support of the distribution is the union of four clusters where each of them represents the support of the distribution of only one of the eigenvalues. This is analogous with the results proven in the literature for rectangular windows [38], i.e., in this case all eigenvalues are separable on the vertical axis.
Figure 6. Support of eigenvalues in the signal plus noise case using exponential window with c_{0} = 0.1 for four distinct eigenvalues λ_{4} = 5,λ_{3} = 3,λ_{2} = 2,λ_{1} = 1 with multiplicitiesα_{1} = α_{2} = α_{3} = 0.1 andα_{4} = 0.7.
Figure 7 illustrates the same curves for c_{0} = 0.4, i.e., the forgetting factor p is reduced compared with Figure 6. We observe that the smaller the forgetting factor of the exponential window the larger the width of the subintervals associated to distinct eigenvalues. In some cases, some of adjacent subintervals may overlap, e.g. in Figure 7, the support associated to λ_{4} = 5 and λ_{3} = 3 have overlap whereas the two smaller ones are separable. Figure 8 shows , domain of h in the complex plane, using the same setting as in Figure 7. It has been shown that D_{h} approaches real axis only for the values of h for which z^{′}(h)>0 in Figure 7 which identifies the regions on the real axis outside of the support of eigenvalues. We observe that D_{h} has no intersection with the real axis between which reveals that the subintervals of support associated with three smallest eigenvalues λ=1,2,3, are not disjoint.
Figure 7. Support of eigenvalues in the signal plus noise case using exponential window with c_{0} = 0.4 for four distinct eigenvaluesλ_{4} = 5, λ_{3} = 3,λ_{2} = 2,λ_{1} = 1 with multiplicitiesα_{1} = α_{2} = α_{3} = 0.1 andα_{4} = 0.7.
Figure 8. The range of the function h(z) for, for the case wherec_{0} = 0.4 for four distinct eigenvaluesλ_{4} = 5,λ_{3} = 3, λ_{2} = 2, λ_{1} = 1 with multiplicities α_{1} = α_{2} = α_{3} = 0.1 and α_{4} = 0.7.
Figure 9, demonstrates the support of l.s.d. of SCM identified using Theorem 3 for c_{0} ∈ {0.1,0.3} and λ_{2} ∈ [1,4] with multiplicity of α_{2} = 0.1 and λ_{1} = 1 with multiplicity of α_{1} = 0.9. We observe that for large values of λ_{2}, the support associated with two eigenvalues are disjoint intervals. However, these two disjoint intervals become connected as the distance between λ_{2} and λ_{1} reduces. In practice, the value of c_{0} determines the window shape and has an important impact on the width of these intervals and on the location of the breakpoint. The location of breakpoint determines the capability of the window to identify two distinct eigenvalues. Figure 9 illustrates that the larger the value of c_{0}, the smaller the breakpoint of the support, i.e., by increasing p, we may be able to separate closer eigenvalues.
Figure 9. Support of eigenvalues of the exponentially weighted windowed data forc_{0} = 0.25, λ_{1} = 1 and λ_{2}∈ [1,4].
5.2 Limiting spectral distribution
In the noise only case, we find an explicit equation for the l.s.d. of the exponentially weighted SCM employing Lambert W function. However in the signal plus noise case, the l.s.d. can not be obtained explicitly and should be calculated numerically using (5) and (47). It is the same as the rectangular window case where the l.s.d of noise only data has M–P distribution, however there is no explicit equation for the signal plus noise case.
To find the imaginary part of the Stieltjes transform, one could alternatively find the complex roots with positive imaginary part of the inverse function z(m) for all z in the support of the eigenvalues, i.e., z ∈ S_{F}. Since the imaginary parts of m(z) and h(z) have the same sign and there is no explicit expression for z(m), we find the complex roots of z(h) using (47) and (48) for any real x_{h} = z(h)∈S_{F}, where Re {h}∈(h_{b−},h_{b+}), b∈{1,…,s}. This can be done by finding ν = Im{h} for which Im {z(h)} = 0. By inserting the calculated h in (49), we obtain the Stieltjes transform for x_{h}∈S_{F}. Finally F^{R}(x) is obtained using (5). According to Proposition 1, for any there exists a unique h satisfying (47) and (48), thus the above procedure results in the desired value of h and m.
6 Simulation results
In Figure 10, we plot the density functions and a histogram to show the accuracy of the derived l.s.d.’s in this article for an array with a finite dimension M = 20 and an exponential window with p = 0.975. In this case we have c_{0} = −M ln(p) = 0.5. In addition, in all our simulations, we used γ = 10^{−8}; thus according to the definition of γ in the truncated exponential window, we have and the truncated exponential window accurately describes the exponential window. In this case, the histogram of the eigenvalues is generated by 2,000 samples of SCMs, each computed from 2,000 independent data sets, where each data set consists of N independent random vectors of length M. Using the forgetting factor of the exponential window, p, the SCM is generated using . Then using the eigenvalues of all of these SCMs the histogram of the eigenvalues of SCMs is generated. It can be observed that the histogram of the eigenvalues accurately fits the derived l.s.d. of the exponentially weighted windowed data in (27). This figure also shows results of the method in [34]. We observe that these results approximately fits the simulated data. As mentioned before, this method uses numerical calculations rather than a closed form expression. The Wishart approximation (for the effective length of window (15)) is also plotted in this figure which has a similar shape with small deviation from the histogram. As mentioned before, in some array signal processing applications, the effective length of the exponential window has been considered to be [22,23]. To evaluate the accuracy of this approximation, the M–P density function using this effective length is also plotted which shows a larger deviation from the simulated data.
Figure 10. Distribution of eigenvalues using the exponential window for M = 20 and p = 0.975 .
In Figure 11, the l.s.d of exponentially windowed data is plotted for different values of p ∈ {0.95,0.97,0.98,0.99,0.995} and M = 20. We observe that as p tends to one, the eigenvalues become more concentrated around their true values. This is because the effective length of the window increases as p approaches 1.
Figure 11. Distribution of eigenvalues using the exponential window for M = 20 and p ∈ {0.95,0.97,0.98,0.99,0.995} .
Figure 12 shows the spectral distribution for an exponentially windowed SCM in a case where the eigenvalues are 12, 7, 3,1 with the same multiplicity ratios for two values of c_{0} = 0.1 and c_{0} = 0.4. It can be seen that as c_{0} decreases (i.e., as the forgetting factor p increases for a fixed value of array dimension M) the spectral distribution tends to concentrate around the true eigenvalues. Figure 12 shows that the supports corresponding to eigenvalues λ_{4} = 12, λ_{3} = 7 are not disjoint for c_{0} = 0.4 where as they are separate for c_{0} = 0.1. In this figure, the empirical distributions are obtained using simulation data with M = 20, and and the l.s.d. are numerically calculated as introduced in the previous section. In this case the multiplicities of all of the eigenvalues of the covariance matrix is 5. We see that the l.s.d. fit the empirical results even for moderate and small array dimensions.
Figure 12. Distribution of eigenvalues of exponentially windowed data forc_{0} = 0.1 andc_{0} = 0.4 where the covariance matrix has 4 distinct eigenvalues 12 , 7 , 3 , 1 with the same multiplicity ratios.
7 Conclusion
In this article the l.s.d. of SCM in the case of weighted windowed data has been studied. Defining the effective length of a window, we have approximated the distribution of the eigenvalues in the weighted window case with that of a Wishart matrix, when the number of samples are much more than array dimension. Also the connectivity condition for coefficients of the window has been developed to avoid fragmentation of the support of eigenvalues in the noise only data. For the exponential window, we have derived an exact expression for the l.s.d. of SCM which has excellent agreement with the simulation results. We have also introduced a way to analyze the support and distribution of eigenvalues in the signal plus noise data cases. The results of this work could be used in design and improvement of detectors and estimators based on weighted windowed data especially when an exponential window is employed.
Endnotes
^{b}The Lambert W function [33], ω (x) is also called the Omega function and is the solution of ωe^{ω} = z for any complex number z. This equation is not injective, thus the function ω(z) is multivalued and has a set of different branches named ω_{k}(z) for any integer k. For real values of z, there exist two real valued branches of Lambert W function ω_{0} (z) and ω_{−1} (z) which take on real values for and complex values, otherwise. The function ω_{0}(z) is referred to as the principal branch of the Lambert W function and shown by ω(z) for simplicity.
Competing interests
The authors declare that they have no competing interests.
Acknowledgments
This work is supported in part by Iran Telecommunication Research Center (ITRC).
References

J Silverstein, A Tulino, Theory of large dimensional random matrices for engineers. IEEE Ninth International Symposium on Spread Spectrum Techniques and Applications (ManausAmazon, Brazil, 2006), pp. 458–464

E Yazdian, S Gazor, M Bastani, Source enumeration in large arrays using moments of eigenvalues and relatively few samples. IET Signal Process 6(7), 689–696 (2012). Publisher Full Text

A Tadaion, M Derakhtian, S Gazor, M Aref, A fast multiplesource detection and localization array signal processing algorithm using the spatial filtering and ML approach. IEEE Trans. Signal Process 55(5), 1815–1827 (2007)

S Affes, S Gazor, Y Grenier, Robust adaptive beamforming via LMSlike target tracking. IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP94 (Adelaide, Australia, 1994), pp. 269–269

S Gazor, S Affes, Y Grenier, Robust adaptive beamforming via target tracking. IEEE Trans. Signal Process 44(6), 1589–1593 (1996). Publisher Full Text

S Gazor, R Far, Adaptive maximum windowed likelihood multicomponent AMFM signal decomposition. IEEE Trans. Audio Speech Lang. Process 14(2), 479–491 (2006)

S Gazor, K Shahtalebi, A new NLMS algorithm for slow noise magnitude variation. IEEE Signal Process. Lett 9(11), 348–351 (2002)

S Kritchman, B Nadler, Nonparametric detection of the number of signals: hypothesis testing and random matrix theory. IEEE Trans. Signal Process 57(10), 3930–3941 (2009)

X Mestre, On the asymptotic behavior of the sample estimates of eigenvalues and eigenvectors of covariance matrices. IEEE Trans. Signal Process 56(11), 5353–5368 (2008)

R Couillet, J Silverstein, Z Bai, M Debbah, Eigeninference for energy estimation of multiple sources. IEEE Trans. Inf. Theory 57(4), 2420–2439 (2011)

J Wishart, The generalised product moment distribution in samples from a normal multivariate population. Biometrika 20(1/2), 32–52 (1928)

A James, Distributions of matrix variates and latent roots derived from normal samples. Ann. Math. Stat 35(2), 475–501 (1964). Publisher Full Text

M Chiani, M Win, A Zanella, On the capacity of spatially correlated MIMO Rayleighfading channels. IEEE Trans. Inf. Theory 49(10), 2363–2371 (2003). Publisher Full Text

V Marčenko, L Pastur, Distribution of eigenvalues for some sets of random matrices. Math. USSRSbornik 1, 457 (1967). Publisher Full Text

J Silverstein, S Choi, Analysis of the limiting spectral distribution of large dimensional random matrices. J. Multivar. Anal 54(2), 295–309 (1995). Publisher Full Text

R Dozier, J Silverstein, Analysis of the limiting spectral distribution of large dimensional informationplusnoise type matrices. J. Multivar. Anal 98(6), 1099–1122 (2007). Publisher Full Text

J Baik, J Silverstein, Eigenvalues of large sample covariance matrices of spiked population models. J. Multivar. Anal 97(6), 1382–1408 (2006). Publisher Full Text

N El Karoui, Tracy–Widom limit for the largest eigenvalue of a large class of complex sample covariance matrices. Ann. Prob 35(2), 663–714 (2007). Publisher Full Text

J Baik, G Ben Arous, S Péché, Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. Ann. Prob 33(5), 1643–1697 (2005). Publisher Full Text

G Ganesan, Y Li, Cooperative spectrum sensing in cognitive radio, part I: two user networks. IEEE Trans. Wirel. Commun 6(6), 2204–2213 (2007)

G Ganesan, Y Li, Agility improvement through cooperative diversity in cognitive radio. in Global Telecommunications Conference, GLOBECOM’05, ed. by . IEEE (St. Louis, MO, 2005), pp. 5–5 PubMed Abstract

B Champagne, Adaptive eigendecomposition of data covariance matrices based on firstorder perturbations. IEEE Trans. Signal Process 42(10), 2758–2770 (1994). Publisher Full Text

S Valaee, P Kabal, An information theoretic approach to source enumeration in array signal processing. IEEE Trans. Signal Process 52(5), 1171–1178 (2004). Publisher Full Text

S Ouyang, Y Hua, Biiterative leastsquare method for subspace tracking. IEEE Trans. Signal Process 53(8), 2984–2996 (2005)

X Doukopoulos, G Moustakides, Fast and stable subspace tracking. IEEE Trans. Signal Process 56(4), 1452–1465 (2008)

Z Burda, J Jurkiewicz, B Wacław, Spectral moments of correlated Wishart matrices. Phys. Rev. E Stat. Nonlinear Soft Mat. Phys 71(2 Pt 2), 026111 (2005)

P Forrester, Eigenvalue distributions for some correlated complex sample covariance matrices. J. Phys. A Math. Theor 40 (2007)

L Zhang, Spectral analysis of large dimensional random matrices, Ph.D. Thesis

D Paul, J Silverstein, No eigenvalues outside the support of the limiting empirical spectral distribution of a separable covariance matrix. J. Multivar. Anal 100, 37–57 (2009). Publisher Full Text

H Zhang, S Jin, X Zhang, D Yang, On marginal distributions of the ordered eigenvalues of certain random matrices. EURASIP J. Adv. Signal Process 2010, 67 (2010)

E Yazdian, MH Bastani, S Gazor, Spectral distribution of the exponentially windowed sample covariance matrix. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Kyoto, Japan, 2012), pp. 3529–3532

Z Bai, J Silverstein, Spectral Analysis of Large Dimensional Random Matrices (: Springer, 2010)

R Corless, G Gonnet, D Hare, D Jeffrey, D Knuth, On the Lambert W function. Adv. Comput. Math 5, 329–359 (1996). Publisher Full Text

S Pafka, M Potters, I Kondor, Exponential weighting and randommatrixtheorybased filtering of financial covariance matrices for portfolio optimization ((available at http://arxiv, 2004), . org/abs/condmat/0402573 webcite)

D DE LACHAPELLE, Modern portfolio theory revisited: from real traders to new methods PhD thesis, ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE

J Daly, M Crane, H Ruskin, Random matrix theory filters in portfolio optimisation: a stability and risk assessment. Phys. A Stat. Mech. Appl 387(16), 4248–4260 (2008). Publisher Full Text

M Potters, J Bouchaud, L Laloux, Financial applications of random matrix theory: old laces and new pieces. Acta Physica Polonica B 36, 2767 (2005)

Z Bai, J Silverstein, Exact separation of eigenvalues of large dimensional sample covariance matrices. Ann. Prob 27(3), 1536–1555 (1999). Publisher Full Text