Matrix Multiplication

Rules

A matrix is laid out by row and column. Menaing, a particular cell in matrix $C$ is $C_{ij}$. For example, $C_{34}$ is the 3rd row, 4th column in matrix $C$.

$\begin{bmatrix} 1,1&1,2&1,3&1,4\\ 2,1&2,2&2,3&2,4\\ 3,1&3,2&3,3&3,4\\ 4,1&4,2&4,3&4,4 \end{bmatrix}$

  1. Number of columns in matrix A must match number of rows in matrix B.
  2. The output of matrix multiplication will be dimensions equal to the number of rows in matrix A by the number of columns in matrix B.
  3. You can cut the matrix into blocks and do matrix multiplication in blocks.

$\begin{bmatrix}A_1&A_2\\A_3&A_4\end{bmatrix} \begin{bmatrix}B_1&B_2\\B_3&B_4\end{bmatrix}= \begin{bmatrix}A_1B_1+A_2B_3&A_1B_2+A_2B_4\\A_3B_1+A_4B_3&A_3B_2+A_4B_4\end{bmatrix}$

Note: While cutting it into blocks may not seem useful immediately, it is crucial for high speed computation. In Deep Learning where you are multiplying large matrices together can speed up computation speed immensely by breaking them into blocks so you can fit the blocks into CPU Memory. In fact, this is exactly what pytorch does.

1st Way

Let's imagine we have a matrix multiplation

$$AB=C$$

$C_{34}=$(row 3 of A)$\cdot$(column 4 of B)

$C_{34}=a_{31}b_{14}+a_{32}b_{24}+......$

$C_{34}=\sum\limits_{k=1}^n a_{3k}b_{k4}$

2nd Way

$$AB=C$$

The second way to think about it is that matrix $A$ times by the first column of matrix $B$ will give you the first column of $C$.

Matrix $A$ times by the second column of matrix $B$ will give you the second column of $C$.

Now really what we are doing is thinking of columns of $C$ as combinations of columns of $A$

$A \cdot B_{n1} =C_{m1}$

3rd Way

$$AB=C$$

The third way to think about it is that a row of $A$ times matrix $B$ will give you a column of $C$.

Now really what we are doing is thinking of rows of $C$ as combinations of rows of $B$.

4th Way

If we multiply a column by a row, we get a full sized matrix. We also see that the columns are multiples of the column, and the rows and multiples of the row. This is what we expect as we just discussed above with the combinations of each other.

$\begin{bmatrix}2\\3\\4\end{bmatrix}\begin{bmatrix}1&6\end{bmatrix}=\begin{bmatrix}2&12\\3&18\\4&24\end{bmatrix}$

$AB=$Sum of (Cols of A) $\cdot$ (Rows of B)

$\begin{bmatrix}2&7\\3&8\\4&9\end{bmatrix}\begin{bmatrix}1&6\\0&0\end{bmatrix}$ $=$ $\begin{bmatrix}2\\3\\4\end{bmatrix}\begin{bmatrix}1&6\end{bmatrix}+$ $\begin{bmatrix}7\\8\\9\end{bmatrix}\begin{bmatrix}0&0\end{bmatrix}$ $=$ $\begin{bmatrix}2&12\\3&18\\4&24\end{bmatrix}$

This matrix all sit on the same line because they are just multiples.

Inverses

Not all Matrices have inverses. Most important question is whether it's invertable. If $A^-1$ exists.

$A^{-1}A = I = AA^{-1}$

Invertable, nonsingular are the good case.

Matrices with no inverse

In the singular case there is no inverse. 2x2 matrix that has no inverse.

$\begin{bmatrix}1&3\\2&6\end{bmatrix}$

Cannot get 1,0, because I can find a vector $x$ with $Ax=0$

$\begin{bmatrix}1&3\\2&6\end{bmatrix}\begin{bmatrix}3\\-1\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}$

But why does this matter? Because if I use these values for X anything I multiply by cannot possibly give the identify matrix, because 0 times anything gives 0.

Matrices with an inverse

$AA^{-1}=I => \begin{bmatrix}1&3\\2&7\end{bmatrix} \begin{bmatrix}a&b\\c&d\end{bmatrix}= \begin{bmatrix}1&0\\0&1\end{bmatrix}$

Now from our matrix multiplication work above we know that $A \cdot$column $j$ of $A^{-1} = $column $j$ of $I$

So we really have a system of 2 equations to solve.

$\begin{bmatrix}1&3\\2&7\end{bmatrix}\begin{bmatrix}a\\c\end{bmatrix}=\begin{bmatrix}1\\0\end{bmatrix}$

$\begin{bmatrix}1&3\\2&7\end{bmatrix}\begin{bmatrix}b\\d\end{bmatrix}=\begin{bmatrix}0\\1\end{bmatrix}$

So we are back to solving systems of equations.

Gauss-Jordan / Find $A^{-1}$

Solve 2 equations at once

$\begin{bmatrix}1&3\\2&7\end{bmatrix}\begin{bmatrix}a\\b\end{bmatrix}=\begin{bmatrix}1\\0\end{bmatrix}$

$\begin{bmatrix}1&3\\2&7\end{bmatrix}\begin{bmatrix}c\\d\end{bmatrix}=\begin{bmatrix}0\\1\end{bmatrix}$

The gauss-Jordan Method solved both equations at once by created an augmented matrix.

We will now do elimination steps to get the identify on the left. This will convert A to A inverse.

  1. Start with $AI$
  2. Elimination step subtracting 2 of the first row from the second row
  3. Elinination step subtracting 3 of the second row from the first.
  4. End with $IA^{-1}$

$ \left[ \begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 2 & 7 & 0 & 1 \\ \end{array} \right] => \left[ \begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 0 & 1 & -2 & 1 \\ \end{array} \right] => \left[ \begin{array}{cc|cc} 1 & 0 & 7 & -3 \\ 0 & 1 & -2 & 1 \\ \end{array} \right] $

Naturally we get the inverse because we really reverse the sides of the equation. Just like in algebra if you have -5 on one side, you can move it by putting the inverse on the other side (+5).