__Solving linear ODEs with repeated eigenvalues *correctly*.__

Let's start with the relationship between eigenvectors and solutions of linear ODEs. Consider the equation

\[

\dot x = Ax, \quad x(0) = x_0 \in \mathbb{R}^n

\]

and assume first that $A$ is diagonalizable. That means we can choose eigenvectors $\{v_1,\dotsc, v_n\}$ of $A$ which form a basis of $\mathbb{R}^n$. The solution of this equation is given simply by $x(t) = \exp(tA)x_0$ where $\exp(tA)$ is the matrix exponential defined by the formula

\[

\exp(tA) = \sum_{k = 0}^\infty \frac{t^k A^k}{k!}.

\]

I'll stick to using $\exp(*)$ for matrix exponentiation and $e^{*}$ for scalars so it is always obvious what kind of exponential is meant.
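As a quick sanity check, the series can be truncated and compared against the closed form on a diagonal matrix, where $\exp(tA)$ is just $e^{t a_{ii}}$ on the diagonal. Here's a minimal sketch in NumPy (the function name `expm_series` and the truncation at 30 terms are my own choices):

```python
import numpy as np

def expm_series(A, t, terms=30):
    """Approximate exp(tA) by truncating the power series sum_k t^k A^k / k!."""
    n = A.shape[0]
    result = np.eye(n)
    term = np.eye(n)          # holds t^k A^k / k! at step k
    for k in range(1, terms):
        term = term @ (t * A) / k
        result = result + term
    return result

# For a diagonal matrix, exp(tA) = diag(e^{t a_ii}), so we can compare directly:
A = np.diag([1.0, -2.0])
approx = expm_series(A, t=0.5)
exact = np.diag(np.exp(0.5 * np.diag(A)))
print(np.allclose(approx, exact))  # prints True
```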

The reason we write the solution this way boils down to the following computation. Let's look at what happens if we choose $x_0$ to be an eigenvector of $A$, say $x_0 = v_i$. Then this formula says that the solution is

\[

x(t) = \exp(tA)v_i = \sum_{k = 0}^\infty \frac{t^k A^k}{k!}v_i = \sum_{k = 0}^\infty \frac{t^k \lambda_i^k}{k!}v_i = e^{\lambda_i t} v_i

\]

where $\lambda_i$ is the eigenvalue for $v_i$. This computation says that a solution which starts in the span of an eigenvector remains in the span of that eigenvector for all $t \in \mathbb{R}$. This is what makes eigenvectors special (at least in this context; they are special for many reasons). Eigenvectors are the “straight line” solutions of the ODE.
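We can watch this collapse happen numerically. The sketch below (my own example, using a symmetric matrix so that a basis of eigenvectors is guaranteed) sums the series $\sum_k t^k A^k v_i / k!$ term by term and checks that it equals $e^{\lambda_i t} v_i$:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])       # symmetric => diagonalizable; eigenvalues -1 and 3
lams, V = np.linalg.eigh(A)      # columns of V are the eigenvectors v_i

t, terms = 0.7, 40
for lam, v in zip(lams, V.T):
    # sum_k t^k A^k v / k!, built term by term
    x, term = np.zeros(2), v.copy()
    for k in range(terms):
        x = x + term
        term = (t / (k + 1)) * (A @ term)
    # the series collapses to e^{lam t} v: a straight-line solution
    assert np.allclose(x, np.exp(lam * t) * v)
print("straight-line check passed")
```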

Now, remove the assumption that $A$ has a basis of eigenvectors. Let's assume $A$ is $2 \times 2$ with repeated eigenvalue $\lambda$, but $A$ has only a single eigenvector for $\lambda$, say $v_1$. It turns out this is the simplest case, but it actually settles the entire problem in all dimensions via a straightforward generalization. To rephrase the situation, we have $ \operatorname{ker}(A - \lambda I) = \operatorname{span}(\{v_1\})$, which raises the important question: where does $A - \lambda I $ map the rest of the vectors in $\mathbb{R}^2$?

It turns out (a consequence of the Cayley-Hamilton theorem) that in this case there **must** exist a $v_2 \notin \operatorname{span}(\{v_1\})$ which satisfies the equation $\left(A - \lambda I\right)^2 v_2 = 0$. But that means the vector $\left(A - \lambda I\right)v_2 \in \operatorname{ker}(A - \lambda I) = \operatorname{span}(\{v_1\})$, so after rescaling $v_2$ appropriately, we can simply assume that $v_1 = \left(A - \lambda I\right)v_2$. The vector $v_2$ is called a *generalized* eigenvector for $A$. Now, let's look at what this means for the ODE.
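For a concrete picture, take the standard defective example $A = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}$ (my own choice for illustration). A quick NumPy check confirms that $v_2 = (0, 1)^T$ is a generalized eigenvector and that $(A - \lambda I)v_2$ is a genuine eigenvector:

```python
import numpy as np

# An illustrative defective matrix with repeated eigenvalue lam = 2
lam = 2.0
A = np.array([[lam, 1.0],
              [0.0, lam]])
N = A - lam * np.eye(2)            # the operator A - lam I

v2 = np.array([0.0, 1.0])          # generalized eigenvector: (A - lam I)^2 v2 = 0
v1 = N @ v2                        # ordinary eigenvector, v1 = (A - lam I) v2

assert np.allclose(N @ N @ v2, 0)  # (A - lam I)^2 v2 = 0
assert np.allclose(N @ v1, 0)      # v1 lies in ker(A - lam I)
print(v1)                          # prints [1. 0.]
```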

To start, let's revisit the computation of the solution for an eigenvector. Only this time, rewrite the condition of being an eigenvector in the form $\left(A - \lambda I\right) v_1 = 0$. Then, taking the matrix exponential of $t\left(A - \lambda I\right)$, the same computation shows that

\[

\exp\left(t \left(A - \lambda I\right)\right) v_1 = \sum_{k = 0}^\infty \frac{t^k \left(A - \lambda I\right)^k}{k!} v_1 = v_1

\]

where we used the fact that $\left(A - \lambda I\right)^k v_1 = 0$ for all $k \geq 1$. It turns out that if two matrices $A$ and $B$ commute with one another, then the matrix exponential satisfies the same nice property as in the scalar case: $\exp(A + B) = \exp(A)\exp(B)$. Since $\lambda I$ commutes with any matrix, we can apply this on the left-hand side of the previous equation to get

\[

\exp\left(t \left(A - \lambda I\right)\right) v_1 = \exp \left(tA\right) \exp \left(-\lambda t I\right)v_1 = e^{-\lambda t} \exp \left(t A\right)v_1.

\]

Combining these equations, we arrive at the identity

\[

e^{-\lambda t} \exp \left(t A\right)v_1 = v_1 \implies \exp \left(tA\right)v_1 = e^{\lambda t} v_1

\]

which is just the same identity we already knew.
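It's worth checking the commuting-exponentials identity itself on an example. The sketch below (again using a truncated power series; the helper `expm_series` is my own) verifies that $\exp\left(t\left(A - \lambda I\right)\right) = \exp(tA)\exp(-\lambda t I)$ for a defective $2 \times 2$ matrix:

```python
import numpy as np

def expm_series(M, terms=40):
    """exp(M) by the truncated power series."""
    result, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

lam, t = 2.0, 0.3
A = np.array([[lam, 1.0],
              [0.0, lam]])
I = np.eye(2)

# A and -lam*I commute, so the exponential splits as a product:
lhs = expm_series(t * (A - lam * I))
rhs = expm_series(t * A) @ expm_series(-lam * t * I)
assert np.allclose(lhs, rhs)
print("commuting exponentials agree")
```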

However, doing the same computation for $v_2$ now reveals where the mysterious factor of $t$ comes from. Start with

\[

\exp\left(t \left(A - \lambda I\right)\right) v_2 = \sum_{k = 0}^\infty \frac{t^k \left(A - \lambda I\right)^k}{k!} v_2 = v_2 + t \left(A - \lambda I\right)v_2

\]

where the additional term comes from the fact that now $\left(A - \lambda I\right)^k v_2 = 0$ only for $k \geq 2$. Applying our nice multiplicative identity again gives us

\[

e^{-\lambda t} \exp \left(t A\right)v_2 = v_2 + t \left(A - \lambda I\right)v_2 \implies \exp \left(tA\right)v_2 = e^{\lambda t} v_2 + t e^{\lambda t}\left(A - \lambda I\right)v_2.

\]

But remember that $\left(A - \lambda I\right)v_2 = v_1$, so we end up with the solution

\[

\exp(tA)v_2 = e^{\lambda t} v_2 + t e^{\lambda t} v_1.

\]
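As a final check, we can compare the power series for $\exp(tA)v_2$ against this closed form (a sketch with my own illustrative numbers, $\lambda = 2$ and $t = 0.5$):

```python
import numpy as np

lam, t = 2.0, 0.5
A = np.array([[lam, 1.0],
              [0.0, lam]])
v1 = np.array([1.0, 0.0])          # eigenvector
v2 = np.array([0.0, 1.0])          # generalized eigenvector, (A - lam I) v2 = v1

# exp(tA) v2 via the truncated power series, built term by term
x, term = np.zeros(2), v2.copy()
for k in range(40):
    x = x + term
    term = (t / (k + 1)) * (A @ term)

# compare against the closed form e^{lam t} v2 + t e^{lam t} v1
closed = np.exp(lam * t) * v2 + t * np.exp(lam * t) * v1
assert np.allclose(x, closed)
print("formula verified")
```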