What You Came Here For

So, you googled "positive semidefinite," hoping someone would explain what a positive semidefinite matrix is. Maybe you are interested in doing some Principal Component Analysis, or maybe you want to solve a linear system with a Cholesky decomposition.
Well, you came to the right place!
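If the Cholesky route is what brought you here, a minimal NumPy sketch looks like the following. The matrix $A$ and right-hand side $b$ are invented for illustration, and np.linalg.cholesky requires $A$ to be symmetric positive definite (the strict version of the property discussed below):

```python
import numpy as np

# A = B B^T is symmetric positive definite when B is square and invertible
# (this B is just an illustrative example).
B = np.array([[2.0, 0.0], [1.0, 3.0]])
A = B @ B.T
b = np.array([1.0, 2.0])

# Cholesky factorization: A = L L^T, with L lower triangular.
# np.linalg.cholesky raises an error if A is not positive definite.
L = np.linalg.cholesky(A)

# Solve A x = b by two triangular solves: L y = b, then L^T x = y.
y = np.linalg.solve(L, b)
x = np.linalg.solve(L.T, y)

print(np.allclose(A @ x, b))  # True
```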

Positive Semidefinite

The positive in positive semidefinite is telling: being positive semidefinite generalizes the property of being a non-negative real number to square matrices. To further suggest this, we often denote a positive semidefinite matrix $M$ as \[ M \ge 0 \] and we say that \[ M \ge N \] if \[ M - N \ge 0 \] This partial order (known as the Loewner order) behaves much like the order on the reals: we can show that if $M \ge N$, then for any $A$, \[ M+A \ge N + A \] and \[ AMA^{\intercal} \ge ANA^{\intercal} \] (Unlike the reals, however, the product of two positive semidefinite matrices need not be positive semidefinite, so this order does not make the matrices into an ordered ring.)
TL;DR Positive semidefinite $\sim$ non-negative.
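To make the order concrete, here is a small NumPy sketch with arbitrarily chosen matrices; it tests $M \ge N$ by checking that the eigenvalues of $M - N$ are all non-negative, which, as we will see later in the article, is equivalent to $M - N \ge 0$:

```python
import numpy as np

def is_psd(A, tol=1e-10):
    """Check A >= 0 (in the Loewner sense) for a symmetric matrix A,
    using the eigenvalue characterization derived later in the article."""
    return np.all(np.linalg.eigvalsh(A) >= -tol)

# Two example symmetric matrices (values chosen arbitrarily for illustration).
M = np.array([[3.0, 1.0], [1.0, 2.0]])
N = np.array([[1.0, 0.0], [0.0, 1.0]])

print(is_psd(M - N))   # is M >= N ?  True for these values

# The order is preserved under congruence: A M A^T >= A N A^T for any A.
A = np.array([[1.0, 2.0], [0.0, 1.0]])
print(is_psd(A @ M @ A.T - A @ N @ A.T))   # True
```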

Bilinear Maps

One way to define positive semidefinite matrices is through their associated bilinear maps:
A bilinear map on a vector space $V$ over a field $K$ is a map \[ \langle\cdot, \cdot\rangle: V \times V \rightarrow K \] such that for every $v$, the maps \[ \langle \cdot, v\rangle \text{ and } \langle v, \cdot \rangle \] are both linear. Another way to look at a bilinear map is as a linear map from $V$ into the dual space $V^*$ of linear maps from $V$ to $K$, \[ f : V \rightarrow V^* \] where $f(v) = f_v$ is the map in $V^*$ which sends $w$ to $f_v(w)= \langle v, w\rangle$. (If you are familiar with some computer science, this is the operation of currying the map in its first argument.)

TL;DR A bilinear map is a linear map from vectors to linear maps from vectors to the base field.
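The currying point of view is easy to mimic in code. Here is a small Python sketch, with an arbitrary bilinear form invented for illustration, in which the two-argument map is turned into a function of one vector that returns a linear functional waiting for the second:

```python
import numpy as np

# An arbitrary example bilinear form on R^2, written as a two-argument function.
def bilinear(v, w):
    return 2 * v[0] * w[0] + v[0] * w[1] + v[1] * w[0] + 3 * v[1] * w[1]

# "Currying" the first argument: f(v) is an element of the dual space V*,
# i.e. a linear functional that still waits for the second vector w.
def f(v):
    return lambda w: bilinear(v, w)

v = np.array([1.0, 2.0])
w = np.array([3.0, -1.0])

f_v = f(v)              # a linear map from V to the base field
print(f_v(w))           # same number as bilinear(v, w)
print(bilinear(v, w))
```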

For a finite dimensional vector space, $V$ and $V^*$ are isomorphic, so by choosing a basis for each of them, we can identify each linear map from $V$ to $V^*$ with a square matrix $M$. In this description, the bilinear map is given by \[ \langle v, w \rangle = vMw^{\intercal} = v\cdot (Mw^{\intercal}) \] where $\cdot$ is the usual dot product with respect to a chosen basis $\beta = \{\beta_i\}$: if $v = \sum_i v_i\beta_i$ and $w = \sum_i w_i \beta_i$, then $v\cdot w = \sum_i v_i w_i$. (This can be derived from the definition of the dual basis.)
Conversely, given any square matrix, we can define a bilinear map through the same formula.

TL;DR Once a basis is chosen, bilinear maps and square matrices are in one-to-one correspondence.
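Here is that correspondence numerically: a square matrix $M$ (arbitrary values, for illustration only) defines a bilinear map via $vMw^{\intercal}$, and linearity in each argument can be spot-checked:

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # an arbitrary square matrix

def form(v, w):
    """The bilinear map v M w^T associated to M (vectors as 1-D arrays)."""
    return v @ M @ w

v, w, u = np.array([1.0, 2.0]), np.array([0.0, -1.0]), np.array([3.0, 1.0])
a, b = 2.0, -0.5

# Linearity in each argument, checked numerically:
print(np.isclose(form(a * v + b * u, w), a * form(v, w) + b * form(u, w)))
print(np.isclose(form(v, a * w + b * u), a * form(v, w) + b * form(v, u)))
```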

Inner Products

So far, this discussion has been entirely algebraic: these definitions hold over arbitrary fields, and can be entirely expressed through linear maps. To take a geometric point of view, we need to make a different definition. If we have a vector space over $\mathbb{R}$, then we can define an inner product as follows: A bilinear map $\langle \cdot, \cdot \rangle$ is an inner product if the following properties hold
Symmetry
$\langle v, w \rangle = \langle w, v \rangle$
Positive-Definiteness
$\langle v, v \rangle \ge 0$, with $\langle v, v \rangle = 0$ iff $v=0$. (The inequality alone is called positive semidefiniteness; an inner product requires the stronger, definite condition.)
An inner product is essentially a way of measuring angles and lengths in an abstract vector space (compare the formula $v\cdot w = |v||w|\cos(\theta)$ for the standard dot product).
Now, using the algebraic formalism, we know that associated to any inner product, there is a matrix $M$ such that $\langle v, w \rangle = vMw^{\intercal}$. The symmetry condition of an inner product corresponds to the symmetry of the matrix $M$, that is to say that $M_{ij} = M_{ji}$.
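As a sanity check, here is a sketch with an arbitrarily chosen symmetric (and, in fact, positive definite) $M$; it treats $\langle v, w \rangle = vMw^{\intercal}$ as an inner product and uses it to measure lengths and angles as in the formula above:

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # symmetric: M[i, j] == M[j, i]

def inner(v, w):
    return v @ M @ w

def norm(v):
    return np.sqrt(inner(v, v))

v = np.array([1.0, 0.0])
w = np.array([0.0, 1.0])

# Symmetry of the form follows from symmetry of M.
print(np.isclose(inner(v, w), inner(w, v)))

# Angle between v and w as measured by this inner product,
# via <v, w> = |v| |w| cos(theta).
cos_theta = inner(v, w) / (norm(v) * norm(w))
print(np.degrees(np.arccos(cos_theta)))  # ~60 degrees for this M
```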

Perhaps not surprisingly, we define $M$ to be positive semidefinite if its associated bilinear map satisfies the positive semidefiniteness condition, i.e. $\langle v, v \rangle \ge 0$ for all $v$.

TL;DR a square matrix $M$ is positive semidefinite iff \[ \forall v \in V,\; vMv^{\intercal} \ge 0 \]
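One easy way to manufacture such matrices is as $M = A^{\intercal}A$ for an arbitrary $A$: then $vMv^{\intercal} = \|Av^{\intercal}\|^2 \ge 0$ automatically. A quick numerical sketch of the TL;DR condition, with an invented $A$ and random test vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# M = A^T A is always positive semidefinite: v M v^T = ||A v^T||^2 >= 0.
A = rng.normal(size=(3, 3))   # arbitrary matrix
M = A.T @ A

# Spot-check the definition on random vectors v.
for _ in range(5):
    v = rng.normal(size=3)
    print(v @ M @ v >= 0)      # always True (up to floating point)
```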

Note
In this article, we only discuss matrices over $\mathbb{R}$, but more generally, we define an inner product over $\mathbb{C}$ as a map which is linear in its first argument and satisfies
Hermitian symmetry
$\langle v, w \rangle = \overline{\langle w, v \rangle}$
and positive-definiteness. The Hermitian symmetry condition corresponds to $M_{ij} = \overline{M_{ji}}$ in the matrix (see the short sketch after this note).
For the rest of this article, we'll assume that these matrices are symmetric.
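As a quick illustration of the Hermitian case from the note above, here is a minimal NumPy sketch; the matrix $M$ is an invented example, and the form $\langle v, w\rangle = vM\bar{w}^{\intercal}$ is one common convention for a map that is linear in its first argument:

```python
import numpy as np

# A Hermitian matrix: M[i, j] == conj(M[j, i]) (values invented for illustration).
M = np.array([[2.0 + 0j, 1.0 - 1j],
              [1.0 + 1j, 3.0 + 0j]])
print(np.allclose(M, M.conj().T))   # Hermitian symmetry

def form(v, w):
    """<v, w> = v M w-bar^T, linear in the first argument."""
    return v @ M @ w.conj()

rng = np.random.default_rng(1)
v = rng.normal(size=2) + 1j * rng.normal(size=2)

# <v, v> is real (up to floating point) and non-negative
# when M is Hermitian positive semidefinite.
print(form(v, v))
print(np.linalg.eigvalsh(M))        # real eigenvalues; here 1 and 4
```

For a Hermitian $M$, the number $\langle v, v \rangle$ is automatically real, which is why it makes sense to ask it to be non-negative.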

Eigenvalue Characterization

It turns out that looking at the eigenvalues of a matrix gives us a way of characterizing positive semidefiniteness. If we let $v$ be an eigenvector of $M$, i.e. $Mv^{\intercal} = \lambda v^{\intercal}$ for some $\lambda$, then we can look at \[ v M v^{\intercal} = v \cdot (\lambda v) = \lambda (v \cdot v) \] Now, using the fact that the standard inner product satisfies $v \cdot v \ge 0$ (you can see this from the fact that it is a sum of squares, which are all non-negative), we see that $vMv^{\intercal}$ has the same sign as $\lambda$. Therefore, we have the following result:

If $M$ has a basis of eigenvectors (which every symmetric matrix does), then $M$ is positive semidefinite iff all of its eigenvalues are non-negative.

We give an explicit proof: if $M$ is positive semidefinite, then in particular, for every eigenvector $v$ with eigenvalue $\lambda$, \[ v M v^{\intercal} = \lambda (v \cdot v) \ge 0 \] which implies that $\lambda \ge 0$, since $v \cdot v > 0$ for $v \neq 0$.
Now, suppose all of the eigenvalues of $M$ are nonnegative. Write any vector $u$ in an orthonormal basis of eigenvectors $\beta = \{b_i\}$ (for symmetric matrices the eigenvectors can be chosen orthonormal), say $u = \sum_i u_i b_i$ with $M b_i^{\intercal} = \lambda_i b_i^{\intercal}$. Then \[ u M u^{\intercal} = \sum_i \sum_j u_iu_j \lambda_j (b_i \cdot b_j) \] Since $b_i \cdot b_j = 0$ if $i \neq j$ and $b_i \cdot b_i = 1$, this summation becomes \[ \sum_i u_i^2 \lambda_i \] which is nonnegative if all of the $\lambda_i$ are nonnegative, as the squares $u_i^2$ are nonnegative.
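Numerically, the two characterizations are easy to compare. Here is a sketch with a randomly generated (and deliberately rank-deficient) symmetric matrix, checking the quadratic-form definition against the sign of the eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(2)

# A random symmetric positive semidefinite matrix (rank-deficient on purpose,
# so that some eigenvalues are essentially zero): M = A^T A with a wide A.
A = rng.normal(size=(2, 4))
M = A.T @ A                      # 4x4, symmetric, PSD, rank <= 2

# Characterization 1: the quadratic form u M u^T is non-negative for all u.
quad_ok = all(u @ M @ u >= -1e-10 for u in rng.normal(size=(100, 4)))

# Characterization 2: all eigenvalues of the symmetric matrix M are non-negative.
eig_ok = np.all(np.linalg.eigvalsh(M) >= -1e-10)

print(quad_ok, eig_ok)                      # both True
print(np.round(np.linalg.eigvalsh(M), 6))   # two (near-)zero eigenvalues
```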
TL;DR In linear algebra, we can often generalize concepts about real numbers to concepts about matrices by applying those concepts to the eigenvalues of the matrices. Positive semidefiniteness is just non-negativity of all the eigenvalues.
