In this post, I am going to motivate an intuitive derivation of the Sherman-Morrison formula using Schur complements.

Recall the following fact (see also this Wikipedia entry on Schur complements).

**Lemma 1**. *Let $M\in{\mathbb{R}}^{(n+m)\times (n+m)}$ be a block
matrix with blocks $\begin{aligned}
M = \begin{pmatrix}
A & B\\
C & D
\end{pmatrix} .\end{aligned}$ If $A$ is invertible, then there exist
invertible matrices $P_1$, $Q_1\in{\mathbb{R}}^{n\times n}$ such that
$\begin{aligned}
P_1 M Q_1 = \begin{pmatrix}
A - BD^{-1}C&\\& D
\end{pmatrix}.\end{aligned}$ Similarly, if $D$ is invertible, then
there exist invertible matrices $P_2$, $Q_2\in{\mathbb{R}}^{n\times n}$
such that $\begin{aligned}
P_2 M Q_2 = \begin{pmatrix}
A &\\&D - CA^{-1}B
\end{pmatrix}.\end{aligned}$*

For the purposes of this post, we won’t need to know what $P_i$ and
$Q_i$ look like. It is enough to know that these matrices *and their
inverses* have simple forms.

We will apply Lemma 1 to the matrix $\begin{aligned} \begin{pmatrix} A & b\\ c^\intercal & 1 \end{pmatrix},\end{aligned}$ where $A\in{\mathbb{R}}^{n\times n}$ is invertible and $b$, $c\in{\mathbb{R}}^n$. For notational convenience, let $P = P_1P_2^{-1}$ and $Q = Q_2^{-1}Q_1$. We have that $\begin{aligned} \begin{pmatrix} A - bc^\intercal & \\ & 1 \end{pmatrix} = P \begin{pmatrix} A & \\ & 1 - c^\intercal A^{-1}b \end{pmatrix}Q.\end{aligned}$ This identity tells us that $A-bc^\intercal$ is invertible if and only if $1 - c^\intercal A^{-1}b$ is invertible, i.e., nonzero. Finally, inverting both sides of the equation, we have that $\begin{aligned} \begin{pmatrix} (A-bc^\intercal)^{-1} & \\ & 1 \end{pmatrix} = Q^{-1}\begin{pmatrix} A^{-1} &\\ & \tfrac{1}{1- c^\intercal A^{-1}b} \end{pmatrix}P^{-1}.\end{aligned}$ In particular, the top left block of the matrix on the right is the inverse of $A - bc^\intercal$.

Looking back at the derivation, the main idea was to use a change of
variables (the $P_i$, $Q_i$ matrices) to “shift” the inversion of
$A - bc^\intercal$, an *a priori* difficult task, to the inversion of
the scalar $1 - c^\intercal A^{-1}b$, a much simpler task. This
generalizes in a number of ways. For example, the same derivation
applied to the $(n+m)\times(n+m)$ dimensional block matrix
$\begin{aligned}
\begin{pmatrix}
A & B\\
C^\intercal & I_m
\end{pmatrix},\end{aligned}$ where $A\in{\mathbb{R}}^{n\times n}$ is
invertible and $B$, $C\in{\mathbb{R}}^{n\times m}$, gives a formula for
the inverse of $A - BC^\intercal$ in terms of the inverses of $A$ and
$I - C^\intercal A^{-1}B$. This latter formula is known as the Woodbury
matrix identity.