An idea for calculating powers of matrices

I wanted to share with you a mathematical trick. Nothing too difficult, just not the first thing that someone would think. One of the important things in mathematics is to try to use the same tools for different jobs. Be creative!

We will use the Caley – Hamilton theorem to calculate arbitrary powers of a n\times n matrix A. But first, let’s remember what the Caley – Hamilton theorem says.

Caley – Hamilton theorem :

Let A be a square, n\times n matrix. Calculate it’s characteristic polynomial, P(\lambda)=|A-\lambda\cdot I|. Then P(A)=0.

In other words, the matrix A verifies it’s characteristic polynomial, a non-trivial result. But why should that be useful in calculating matrix powers? And why shouldn’t we just do  the calculation directly?

Let’s count the multiplications needed to calculate A^2. So, we have n rows consisting of n elements each and we have to multiply each one of those rows with all of the n columns of A. So, in total we have to do n^3 multiplications. How does that scale with n? Let’s ask Wolfram.

That’s that. And those results are only valid if we want to calculate A^2. If we want A^k,\ k=3,4,5,\ldots, we have n^3\cdot (k-1) and things start to get out of hand.

So you may wonder . . . since multiplying two matrices is so straightforward, how can we do better than that? Well, matrix multiplication is different than number multiplication in many things and it seems that we can do some stuff to make life easier. Let’s have the general 2\times 2 matrix A.

A=\left (\begin{array}{rl}a& b\\c& d\end{array}\right)

where a, b, c and d are real constants. We move on to calculate A’s characteristic polynomial.

|A-\lambda\cdot I|=\left |\begin{array}{rl}a-\lambda& b\\c& d-\lambda\end{array}\right|=\lambda^2-(a+d)\lambda-cb+ad

which by the previous theorem, we know what the result will be if we change \lambda with A.

A^{2}-(a+d)A-(cb-ad)\cdot I_{n}=0\Rightarrow A^{2}=(a+d)A+(cb-ad)\cdot I_{n}

So now, by using this theorem, we got back a formula for A^{2} which involves only 2 multiplications, one addition and one subtraction. Not too shabby!

But the fun doesn’t end here. Since we know that this formula for A^2 holds, we can multiply the whole equation with A and then we will also have a formula for A^{3}, consisting of lower powers of A. But because we already have a formula for all those powers in terms of A, we can express A^{3} as a sum of a multiple of A and a multiple of I.

UPDATE : 7 august 2012

As it turns out, I was overly ambitious and confident in my back of the envelope calculation and of course my general formula is wrong, meaning that it doesn’t follow that pattern. It’s actually very easy to see that and I thank Vahid Ranjbar for the heads up. 🙂

But what about our general formula? Does it exist? In my opinion, there does not seem to be any general pattern that could make the formula “short and beautiful” as I hoped.

Here’s some mathematica code to replicate what I am about to show you.


y = trA*A + detA;

(*Make sure that you input the above code in a different cell from the one below.*)
y = Simplify[Expand[y*A] /. A^2 -> y]

So, even if you don’t speak mathematica, you should have got by now that this code generates the formula for A^n if you run it n-1 times. So, here’s what we get  for different values of n.

A^3 =A \left(\text{detA}+\text{trA}^2\right)+\text{detA} \text{trA}

A^4=A \left(\text{detA}^2+\text{detA} (2 \text{trA}+1)\text{trA}+\text{trA}^4\right)+\text{detA} \text{trA}\left(\text{detA}+\text{trA}^2\right)

And we keep on getting more and more terms. The formula for A^5 has 24 and the formula for A^6 73. So while feasible for small powers of A, we can forget it for large ones.



    1. You are right of course. Thanks for the heads up!

      It turns out that there is no such easy formula (expressing the power of the matrix to powers of the trace and the determinant) that works for every possible n and it gets messy with higher powers. I wonder why I didn’t immediately try the formula to see if it’s correct.

      Anyway, sorry if you visited my blog with hopes of using such a result. To my knowledge, such a formula doesn’t exist.

    1. Sure, that is known. 🙂

      The search for another way started from thinking about this and lower rank approximations.

Comments are closed.