In this post, I am going to motivate an intuitive derivation of the Sherman-Morrison formula using Schur complements.

Recall the following fact (see also this Wikipedia entry on Schur complements).

Lemma 1. Let $M\in{\mathbb{R}}^{(n+m)\times (n+m)}$ be a block matrix with blocks \begin{aligned} M = \begin{pmatrix} A & B\\ C & D \end{pmatrix} .\end{aligned} If $A$ is invertible, then there exist invertible matrices $P_1$, $Q_1\in{\mathbb{R}}^{n\times n}$ such that \begin{aligned} P_1 M Q_1 = \begin{pmatrix} A - BD^{-1}C&\\& D \end{pmatrix}.\end{aligned} Similarly, if $D$ is invertible, then there exist invertible matrices $P_2$, $Q_2\in{\mathbb{R}}^{n\times n}$ such that \begin{aligned} P_2 M Q_2 = \begin{pmatrix} A &\\&D - CA^{-1}B \end{pmatrix}.\end{aligned}

For the purposes of this post, we won’t need to know what $P_i$ and $Q_i$ look like. It is enough to know that these matrices and their inverses have simple forms.

We will apply Lemma 1 to the matrix \begin{aligned} \begin{pmatrix} A & b\\ c^\intercal & 1 \end{pmatrix},\end{aligned} where $A\in{\mathbb{R}}^{n\times n}$ is invertible and $b$, $c\in{\mathbb{R}}^n$. For notational convenience, let $P = P_1P_2^{-1}$ and $Q = Q_2^{-1}Q_1$. We have that \begin{aligned} \begin{pmatrix} A - bc^\intercal & \\ & 1 \end{pmatrix} = P \begin{pmatrix} A & \\ & 1 - c^\intercal A^{-1}b \end{pmatrix}Q.\end{aligned} This identity tells us that $A-bc^\intercal$ is invertible if and only if $1 - c^\intercal A^{-1}b$ is invertible, i.e., nonzero. Finally, inverting both sides of the equation, we have that \begin{aligned} \begin{pmatrix} (A-bc^\intercal)^{-1} & \\ & 1 \end{pmatrix} = Q^{-1}\begin{pmatrix} A^{-1} &\\ & \tfrac{1}{1- c^\intercal A^{-1}b} \end{pmatrix}P^{-1}.\end{aligned} In particular, the top left block of the matrix on the right is the inverse of $A - bc^\intercal$.

Looking back at the derivation, the main idea was to use a change of variables (the $P_i$, $Q_i$ matrices) to “shift” the inversion of $A - bc^\intercal$, an a priori difficult task, to the inversion of the scalar $1 - c^\intercal A^{-1}b$, a much simpler task. This generalizes in a number of ways. For example, the same derivation applied to the $(n+m)\times(n+m)$ dimensional block matrix \begin{aligned} \begin{pmatrix} A & B\\ C^\intercal & I_m \end{pmatrix},\end{aligned} where $A\in{\mathbb{R}}^{n\times n}$ is invertible and $B$, $C\in{\mathbb{R}}^{n\times m}$, gives a formula for the inverse of $A - BC^\intercal$ in terms of the inverses of $A$ and $I - C^\intercal A^{-1}B$. This latter formula is known as the Woodbury matrix identity.

 Last updated Feb 28, 2021