What’s So Important About Covariance Matrices, Anyway?

I’ve often been interested in why covariance matrices turn up in so many different fields of mathematics. I mean, I know they capture the covariances between different features, but why are they useful beyond that? Specifically, if I’m looking at some time series data from different detectors, how can they help me find the secrets hidden within?

Let’s first define a covariance matrix. If X is an m\times n matrix whose n columns are time series, each containing m observations, then the sample covariance matrix is given by \frac{1}{m-1}X^{T}X. The elements on the diagonal are \frac{1}{m-1}\sum_{t=1}^{m} x_{i}(t)^{2}, which is the sample second moment of the i-th time series – if the data are centred (mean removed), this is the sample variance. Similarly, on the off-diagonal we get \frac{1}{m-1}\sum_{t=1}^{m} x_{i}(t)x_{j}(t), which (again for centred data) is the sample covariance between time series i and j.
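
To make the definition concrete, here is a minimal NumPy sketch – using simulated Gaussian noise rather than real detector data – that builds the covariance matrix by hand and checks it against np.cov:

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated data: m observations (rows) of n time series (columns).
    m, n = 1000, 4
    X = rng.standard_normal((m, n))

    # Centre each time series (subtract its column mean) so that the
    # second moments below really are variances and covariances.
    Xc = X - X.mean(axis=0)

    # Sample covariance matrix, (1 / (m - 1)) * X^T X.
    C = Xc.T @ Xc / (m - 1)

    # np.cov treats rows as variables by default, so pass rowvar=False
    # for our columns-as-time-series convention.
    assert np.allclose(C, np.cov(X, rowvar=False))
    print(C)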

The first reason we like a matrix like X^{T}X is that it is guaranteed to be square and symmetric (and, in fact, positive semi-definite). Being square and symmetric, it has real eigenvalues and orthogonal eigenvectors, so we can use eigendecomposition to analyse it.

Another important construction is the singular value decomposition (SVD). This states that any m\times n matrix X can be decomposed into three parts, U, S and V, such that X = USV^{T}, where S is a (rectangular) diagonal matrix of non-negative singular values. The matrices U and V are unitary – meaning that their conjugate transpose is also their inverse. If they are real matrices then their transpose is their inverse, so U^{T} U = U U^{T} = I.
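
As a quick sketch of what this looks like in practice (again with simulated data): np.linalg.svd returns V^{T} rather than V, and with full_matrices=False it gives the thin SVD, in which only the columns of U are orthonormal:

    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 1000, 4
    X = rng.standard_normal((m, n))

    # Thin (economy) SVD: U is m x n, s holds the n singular values in
    # descending order, and Vt is V^T (n x n).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    # U and V have orthonormal columns: U^T U = I and V^T V = I.
    assert np.allclose(U.T @ U, np.eye(n))
    assert np.allclose(Vt @ Vt.T, np.eye(n))

    # The three factors reconstruct X.
    assert np.allclose(U @ np.diag(s) @ Vt, X)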

We can see that these two decompositions are related! If we set X = USV^{T} then

    \[X^{T}X = V S^{T} U^{T} U S V^{T} = V S^{T} S V^{T} = V S^{2} V^{T}.\]

The matrix S^{2} = S^{T}S is diagonal, so this gives an eigendecomposition of X^{T}X – and hence, up to the factor of \frac{1}{m-1}, of the array covariance matrix! The eigenvalues are the squared singular values, and the eigenvectors are the columns of V from the singular value decomposition.
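
We can check this relationship numerically. The sketch below (once more on simulated data) compares the SVD of X with the eigendecomposition of X^{T}X: the eigenvalues come out as the squared singular values, and the eigenvectors match the columns of V up to a sign flip per column, since both decompositions only fix each vector up to sign:

    import numpy as np

    rng = np.random.default_rng(2)
    m, n = 1000, 4
    X = rng.standard_normal((m, n))

    # Route 1: SVD of the data matrix itself.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt.T

    # Route 2: eigendecomposition of X^T X (eigh handles symmetric
    # matrices and returns eigenvalues in ascending order, so flip them).
    evals, evecs = np.linalg.eigh(X.T @ X)
    evals, evecs = evals[::-1], evecs[:, ::-1]

    # The eigenvalues are the squared singular values ...
    assert np.allclose(evals, s**2)

    # ... and the eigenvectors are the columns of V, up to a sign flip
    # per column.
    signs = np.sign(np.sum(V * evecs, axis=0))
    assert np.allclose(V, evecs * signs)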

So why would you choose to do one over the other? They both give a decomposition of the matrix – and sometimes people even do an SVD of the covariance matrix!

The answer lies in numerical stability. Explicitly forming X^{T}X squares the condition number of the problem, so small singular values can be swamped by floating-point rounding, and small changes in the data can then make a big difference to the computed eigenvalues and eigenvectors. The SVD works on X directly and tends to give more reliable results.
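
A classic illustration of this is the Läuchli matrix, sketched below: squaring the data to form X^{T}X also squares the condition number, so a singular value whose square falls below machine precision is wiped out before the eigensolver ever sees it, while the SVD of X itself recovers it without trouble:

    import numpy as np

    # eps is small enough that eps**2 vanishes next to 1 in double
    # precision, but eps itself does not.
    eps = 1e-8
    X = np.array([[1.0, 1.0],
                  [eps, 0.0],
                  [0.0, eps]])

    # Working on X directly, the SVD recovers both singular values,
    # roughly sqrt(2) and eps.
    print(np.linalg.svd(X, compute_uv=False))   # ~ [1.414e+00, 1e-08]

    # Forming X^T X rounds 1 + eps**2 to exactly 1, so the product is
    # singular in floating point and the small singular value is gone.
    gram = X.T @ X
    print(np.sqrt(np.abs(np.linalg.eigvalsh(gram)))[::-1])   # ~ [1.414e+00, 0.0]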

In the next few articles, we’re going to see why these decompositions are helpful to us by applying them to the problem of observing astronomical signals with an array of receivers in the presence of radio frequency interference. We’re going to see that these decompositions are extremely powerful in stripping out common interference signals and recovering the signals buried beneath!

