Geometry of Linear Equations (18.06_L1)
Linear Algebra - An introduction to Systems of Linear Equations
- Motivation
- Goal
- Matrix Multiplication (Ax=b)
- 2 equations 2 unknowns
- 3 equations 3 unknowns
- Numpy Solver
import matplotlib.pyplot as plt
from torch import tensor
import numpy as np
from mpl_toolkits import mplot3d
from mpl_toolkits.mplot3d import Axes3D
Motivation
One of my goals is to understand more deeply what neural networks are doing. Another is to have an easier time understanding and implementing cutting-edge academic papers. To work toward those goals, I am revisiting the math behind neural networks. This time my aim is to understand every piece of the material intuitively, forward and backward, rather than to pass a course on a deadline.
This blog post will be my notes about Lecture 1 from the following course:
Gilbert Strang. 18.06 Linear Algebra. Spring 2010. Massachusetts Institute of Technology: MIT OpenCourseWare, https://ocw.mit.edu. License: Creative Commons BY-NC-SA.
Matrix Multiplication (Ax=b)
How do we multiply these together?
$\begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}$ $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ $=1$ $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$ $+2$ $\begin{bmatrix} 5 \\ 3 \end{bmatrix}$ $=$ $\begin{bmatrix} 12\\7 \end{bmatrix}$
Ax is a linear combination of the columns of A: the entries of x are the weights, here 1 times the first column plus 2 times the second.
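A quick numerical check of the column view, as a minimal sketch in NumPy. The names `A`, `x_vec`, `by_columns`, and `by_matmul` are my own illustrative choices, not from the lecture:

```python
import numpy as np

A = np.array([[2, 5],
              [1, 3]])
x_vec = np.array([1, 2])

# Column view: weight each column of A by the matching entry of x_vec, then add.
by_columns = x_vec[0] * A[:, 0] + x_vec[1] * A[:, 1]

# NumPy's built-in matrix-vector product, for comparison.
by_matmul = A @ x_vec

print(by_columns)  # [12  7]
print(by_matmul)   # [12  7]
```

Both routes give the same vector, which is the point: the matrix-vector product is just a weighted sum of columns.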
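2 equations 2 unknowns
def plot_equations_2d(x_range, y_dict):
    # Row picture: plot each line in y_dict (label -> y values) against x_range.
    for label in y_dict:
        plt.plot(x_range, y_dict[label], label=label)
    plt.xlabel('x', color='#1C2833')
    plt.ylabel('y', color='#1C2833')
    plt.legend(loc='upper left')
    plt.grid()
    plt.show()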
x = tensor(np.linspace(-4,4,100))
y_dict = {'2x-y=0':2*x,
'-x+2y=3':(3 + x)/2}
plot_equations_2d(x,y_dict)
$x$ $\begin{bmatrix} 2 \\ -1 \end{bmatrix}$ $+y$ $\begin{bmatrix} -1 \\ 2 \end{bmatrix}$ $=$ $\begin{bmatrix} 0 \\ 3 \end{bmatrix}$
# Column picture: column 1 from the origin, then column 2 placed at its tip.
strt_pts = tensor([[0,0],[2,-1]])
end_pts = tensor([[2,-1],[1,1]])
diff = end_pts - strt_pts
plt.ylim([-3, 3])
plt.xlim([-3, 3])
plt.quiver(strt_pts[:,0], strt_pts[:,1], diff[:,0], diff[:,1],
angles='xy', scale_units='xy', scale=1.)
plt.show()
# 1 * column 1 followed by 2 * column 2, tip to tail, ends at b = (0, 3).
strt_pts = tensor([[0,0],[2,-1],[1,1]])
end_pts = tensor([[2,-1],[1,1],[0,3]])
diff = end_pts - strt_pts
plt.ylim([-3, 3])
plt.xlim([-3, 3])
plt.quiver(strt_pts[:,0], strt_pts[:,1], diff[:,0], diff[:,1],
angles='xy', scale_units='xy', scale=1.)
plt.show()
$\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$ $\begin{bmatrix} x \\ y \end{bmatrix}$ $=$ $\begin{bmatrix} 0 \\ 3 \end{bmatrix}$
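To tie the row picture, the column picture, and the matrix form together, here is a small sketch that solves this 2x2 system with `np.linalg.solve` and checks the answer both ways. The variable names (`A2`, `b2`, `x_sol`, `y_sol`, `combo`) are hypothetical and chosen only for this example:

```python
import numpy as np

A2 = np.array([[2.0, -1.0],
               [-1.0, 2.0]])
b2 = np.array([0.0, 3.0])

solution = np.linalg.solve(A2, b2)
x_sol, y_sol = solution

# Row picture: the point (x_sol, y_sol) lies on both lines.
assert np.isclose(2 * x_sol - y_sol, 0)
assert np.isclose(-x_sol + 2 * y_sol, 3)

# Column picture: x_sol * column 1 + y_sol * column 2 lands on b2.
combo = x_sol * A2[:, 0] + y_sol * A2[:, 1]
assert np.allclose(combo, b2)

print(solution)
```

The solution is x = 1, y = 2, which matches the arrows above: one copy of the first column plus two copies of the second.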
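3 equations 3 unknowns
# Row picture: each equation is a plane in 3D, solved here for y.
#   2x -  y      =  0
#   -x + 2y -  z = -1
#       -3y + 4z =  4
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x, z = tensor(np.linspace(-8,8,100)), tensor(np.linspace(-8,8,100))
X, Z = np.meshgrid(x,z)
Y1 = 2*X
Y2 = (-1 + X + Z) / 2
Y3 = (4*Z - 4)/3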
ax.plot_surface(X,Y1,Z, alpha=0.5, rstride=100, cstride=100)
ax.plot_surface(X,Y2,Z, alpha=0.5, rstride=100, cstride=100)
ax.plot_surface(X,Y3,Z, alpha=0.5, rstride=100, cstride=100)
plt.show()
$x$ $\begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix}$ $+y$ $\begin{bmatrix} -1 \\ 2 \\ -3 \end{bmatrix}$ $+z$ $\begin{bmatrix} 0 \\ -1 \\ 4 \end{bmatrix}$ $=$ $\begin{bmatrix} 0 \\ -1 \\ 4 \end{bmatrix}$
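Notice that the right-hand side is exactly the third column, so the combination x = 0, y = 0, z = 1 should work. A minimal check, with illustrative names (`col1`, `col2`, `col3`, `rhs`, `combo3`) that are not from the source:

```python
import numpy as np

# The three columns from the column picture above.
col1 = np.array([2, -1, 0])
col2 = np.array([-1, 2, -3])
col3 = np.array([0, -1, 4])
rhs = np.array([0, -1, 4])

# Candidate weights: none of column 1, none of column 2, one of column 3.
x_w, y_w, z_w = 0, 0, 1
combo3 = x_w * col1 + y_w * col2 + z_w * col3

print(np.array_equal(combo3, rhs))  # True
```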
# Column picture: the three columns of A, each drawn from the origin.
strt_pts = tensor([[0,0,0],[0,0,0],[0,0,0]])
end_pts = tensor([[2,-1,0],[-1,2,-3],[0,-1,4]])
diff = end_pts - strt_pts
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_xlim([-5, 5])
ax.set_ylim([-5, 5])
ax.set_zlim([-5, 5])
# Draw one arrow per vector: start point (x, y, z) followed by direction (u, v, w).
for start, direction in zip(strt_pts, diff):
    ax.quiver(*start.tolist(), *direction.tolist())
plt.show()
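Numpy Solver
# Rows of a are the three equations; b is the right-hand side.
a = np.array([[2, -1, 0], [-1, 2, -1], [0, -3, 4]])
b = np.array([0, -1, 4])
x = np.linalg.solve(a, b)
# Expect approximately (0, 0, 1), since b equals the third column of a.
print(x)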
# Solve the same system for 20 random right-hand sides, one column of b at a time.
length_b = 20
b = np.random.rand(3, length_b) * 10
for i in range(length_b):
    x = np.linalg.solve(a, b[:, i])
    print(x)
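The loop above can also be written as a single call, because `np.linalg.solve` accepts a matrix of right-hand sides and solves for every column at once. This is a sketch under my own naming (`A3`, `B`, `X_all`) with a seeded generator so the run is repeatable; it is an alternative, not what the original notebook did:

```python
import numpy as np

A3 = np.array([[2.0, -1.0, 0.0],
               [-1.0, 2.0, -1.0],
               [0.0, -3.0, 4.0]])

rng = np.random.default_rng(0)   # seeded for repeatability
B = rng.random((3, 20)) * 10     # each column is one right-hand side

# One call solves A3 @ X_all = B for all 20 columns.
X_all = np.linalg.solve(A3, B)

# Every column of X_all should reproduce the matching column of B.
assert np.allclose(A3 @ X_all, B)
print(X_all.shape)  # (3, 20)
```

Each column of `X_all` is the solution for the corresponding column of `B`, so `X_all[:, i]` plays the role of `x` in iteration `i` of the loop.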