We will now turn our attention to solving simultaneous equations. Elimination and substitution are the typical methods we employ, but it turns out matrix multiplication offers another way to obtain a solution.
This relies on a property of matrices called the matrix inverse: multiplying a matrix by its inverse results in an identity matrix.
$$A^{-1} \ast A = I$$
where $A^{-1}$ is the inverse of matrix $A$ and $I$ is the identity matrix.
To solve a simultaneous equation $A \ast r = s$ for vector $r$, we can rearrange the equation as follows
$$\begin{aligned} A \ast r &= s \\ A^{-1} \ast (A \ast r) &= A^{-1} \ast s \\ r &= A^{-1} \ast s \end{aligned}$$
Therefore, the solution for vector $r$ can be obtained by multiplying $A^{-1}$ with $s$.
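As a quick sanity check, here is a minimal numpy sketch of this approach (numpy is assumed; the matrix and right-hand side are made-up values for illustration):

```python
import numpy as np

# A made-up system A * r = s to illustrate solving via the inverse.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
s = np.array([5.0, 6.0])

r = np.linalg.inv(A) @ s  # r = A^-1 * s

# The result satisfies the original equation.
assert np.allclose(A @ r, s)
```

In practice `np.linalg.solve(A, s)` is preferred over computing the inverse explicitly, but the underlying idea is the same.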
However, finding a matrix inverse is a non-trivial task in general. There exists one shortcut to calculate the inverse of a 2 by 2 square matrix.
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \tag{1}$$
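Equation (1) is easy to check numerically. A small sketch (assuming numpy; `inverse_2x2` is a hypothetical helper name) comparing the shortcut against the general-purpose inverse:

```python
import numpy as np

def inverse_2x2(m):
    """Invert a 2x2 matrix [[a, b], [c, d]] using equation (1)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[3.0, 1.0],
              [1.0, 1.0]])

# The shortcut formula agrees with numpy's general-purpose inverse.
assert np.allclose(inverse_2x2(A), np.linalg.inv(A))
```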
To find the inverse of a matrix in higher dimensions, QR decomposition could be one approach, but that is out of scope for our discussion here.
Matrix Determinant
One concept closely related to the matrix inverse is the matrix determinant. For a matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, we can draw a parallelogram with vectors $\begin{pmatrix} a \\ c \end{pmatrix}$ and $\begin{pmatrix} b \\ d \end{pmatrix}$. The determinant of this matrix is then defined as the area of this parallelogram.
This area is calculated by the formula below.
$$|A| = (a + b) \ast (c + d) - ac - bd - 2bc = ad - bc$$
The symbol for the determinant is two vertical bars ($\lvert A \rvert$), just like the modulus operator for vectors.
Recall that $ad - bc$ is also the term we used in the matrix inverse calculation for a 2 by 2 matrix (equation 1). Therefore our matrix inverse formula can be simplified as
$$A^{-1} = \frac{1}{|A|} \ast \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \tag{2}$$
We can also see that not all matrices are invertible. In order to find the inverse of a matrix, we have to compute its determinant first. However, for a matrix like $\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}$, the determinant is $1 \ast 2 - 2 \ast 1 = 0$, which results in a division by 0 when we substitute the determinant value into equation (2). This is because for a matrix to have a non-zero determinant, its basis vectors must be linearly independent. In our example 2 by 2 matrix $A$, $\begin{pmatrix} a \\ c \end{pmatrix}$ and $\begin{pmatrix} b \\ d \end{pmatrix}$ must not lie on the same line for a valid matrix inverse to exist. In the simultaneous equation $A \ast r = s$, if matrix $A$ has no inverse there is no unique solution for vector $r$: depending on $s$, there are either infinitely many solutions or none at all.
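We can observe this failure mode directly in code (numpy assumed); numpy refuses to invert such a matrix:

```python
import numpy as np

# Columns (1, 1) and (2, 2) lie on the same line, so the matrix is singular.
A = np.array([[1.0, 2.0],
              [1.0, 2.0]])
print(np.linalg.det(A))  # determinant is 0

try:
    np.linalg.inv(A)
except np.linalg.LinAlgError:
    print("matrix is not invertible")
```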
Matrices Changing Basis
Changing Basis in General
We are going to revisit the topic of changing basis here after we have grasped the concept of matrix transformation on vectors.
First, let's define 2 new basis vectors $b_1$ and $b_2$ where $b_1 = \begin{pmatrix} 3 \\ 1 \end{pmatrix}$ and $b_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Recall from our discussion of matrix transformations that the new basis vectors $b_1$ and $b_2$ are in fact the transformation of basis vectors $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ by matrix $\begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix}$. Note $b_1$ and $b_2$ can also be expressed in the basis of $e_1$ and $e_2$.
Now we have a vector $r$ that is defined in the $b_1$, $b_2$ basis as $r = \frac{3}{2} b_1 + \frac{1}{2} b_2$. How do we get the same vector $r$ expressed in the $e_1$, $e_2$ basis? We can substitute in vectors $b_1$ and $b_2$ expressed in the $e_1$, $e_2$ basis.
$$r_E = \frac{3}{2} b_1 + \frac{1}{2} b_2 = \frac{3}{2} \begin{pmatrix} 3 \\ 1 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 5 \\ 2 \end{pmatrix}$$
Alternatively, we can convert the vector $r = \frac{3}{2} b_1 + \frac{1}{2} b_2$ from the $b_1$, $b_2$ basis to the $e_1$, $e_2$ basis by multiplying by the transformation matrix $\begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix}$.
$$r_E = \begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix} \ast \begin{pmatrix} \frac{3}{2} \\ \frac{1}{2} \end{pmatrix} = \begin{pmatrix} 5 \\ 2 \end{pmatrix}$$
This is illustrated in the graph below. Note the expressions for vectors in the $e_1$, $e_2$ basis are colored black while those for vectors in the $b_1$, $b_2$ basis are colored orange.
That is how we convert a vector from b1 and b2 basis to e1 and e2 basis. But what is more interesting is to convert a vector from e1 and e2 basis to b1 and b2 basis. This should somehow “reverse” our previous process.
We first need to find out where e1 and e2 are in b1 and b2 basis. This is where matrix inverse comes into play.
$$\begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix}^{-1} = \frac{1}{3 - 1} \begin{pmatrix} 1 & -1 \\ -1 & 3 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 3 \end{pmatrix}$$
Therefore, $e_1 = \frac{1}{2} b_1 - \frac{1}{2} b_2$ and $e_2 = -\frac{1}{2} b_1 + \frac{3}{2} b_2$ in the $b_1$, $b_2$ basis. We can verify this by substituting the values of $b_1$ and $b_2$ back to get the original $e_1$ and $e_2$.
This demonstrates a complete loop of vector conversion between 2 sets of basis vectors e1, e2 and b1, b2. We can add the expression for all the discussed vectors in graph below.
To sum it up, we always need to find the matrix representation of the current basis vectors in the basis of the target vector space. If we want to convert a vector from the $b_1$, $b_2$ basis to the $e_1$, $e_2$ basis, we need to find matrix $\begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix}$, which is $b_1$ and $b_2$ expressed in the $e_1$, $e_2$ basis. Conversely, if we want to convert a vector from the $e_1$, $e_2$ basis to the $b_1$, $b_2$ basis, we need to find matrix $\frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 3 \end{pmatrix}$, which is $e_1$ and $e_2$ expressed in the $b_1$, $b_2$ basis. Then we multiply this matrix with the vector in the current vector space to get the converted vector in the target vector space.
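A brief numpy sketch (numpy assumed) of this round trip using the matrices above:

```python
import numpy as np

# Columns of B are b1 and b2 expressed in the e1, e2 basis.
B = np.array([[3.0, 1.0],
              [1.0, 1.0]])
B_inv = np.linalg.inv(B)

r_B = np.array([3 / 2, 1 / 2])  # r in the b1, b2 basis

r_E = B @ r_B                   # convert to the e1, e2 basis
assert np.allclose(r_E, [5.0, 2.0])

# Converting back with the inverse recovers the original coordinates.
assert np.allclose(B_inv @ r_E, r_B)
```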
Transformation matrices provide us a way to change vector basis in the general case. We have learned previously that when the target basis vectors $b_1$ and $b_2$ are orthogonal to each other, there is an easier approach that does not require computing this transformation matrix: we can obtain the new expression of a vector by projecting it onto the target basis vectors $b_1$ and $b_2$ directly.
For example, we have our orthogonal target basis vectors $b_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $b_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ 1 \end{pmatrix}$. To verify that $b_1$ and $b_2$ are orthogonal to each other, we can compute the dot product $b_1 \cdot b_2 = 0$. A vector $r = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 3 \end{pmatrix}$ originally in the $e_1$, $e_2$ basis can then be converted to the $b_1$, $b_2$ basis as:
$$r_B = \begin{pmatrix} r \cdot b_1 \\ r \cdot b_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$$
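A small numpy sketch (numpy assumed) of this projection approach:

```python
import numpy as np

b1 = np.array([1.0, 1.0]) / np.sqrt(2)
b2 = np.array([-1.0, 1.0]) / np.sqrt(2)
assert np.isclose(b1 @ b2, 0.0)  # the basis vectors are orthogonal

r = np.array([1.0, 3.0]) / np.sqrt(2)

# With unit-length orthogonal basis vectors, each new coordinate
# is just a dot product with the corresponding basis vector.
r_B = np.array([r @ b1, r @ b2])
print(r_B)  # [2. 1.]
```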
So we know how to perform a change in basis when the target basis vectors are orthogonal to each other. We also know how to perform such change in a general case by using matrix inverse. This is a very important technique. We will use it often in solving other matrix related problems.
Doing Transformation in a Changed Basis
There is one more extension to our matrices changing basis concept. Going back to our previous example of new basis vectors $b_1 = \begin{pmatrix} 3 \\ 1 \end{pmatrix}$, $b_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$: for the vector $r = \begin{pmatrix} \frac{3}{2} \\ \frac{1}{2} \end{pmatrix}$ in the $b_1$, $b_2$ basis, we want to find a vector $r'$ that is the result of rotating $r$ by 90° anti-clockwise. The difficult part is that this rotation happens not in the original $e_1$, $e_2$ basis, but is referenced to the $b_1$, $b_2$ basis. How shall we do that?
We might not know how to express the 90° anti-clockwise rotation transformation matrix in the $b_1$, $b_2$ basis. Nonetheless, we know how to do this rotation in our original basis vectors $e_1$ and $e_2$. This rotation transformation matrix is given by $T_E = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ (recall from our previous discussion on matrix transformation). So here is what we can do to accomplish our goal.
We first convert the vector $r$ into the $e_1$, $e_2$ basis by transformation matrix $B = \begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix}$.
$$r_E = B \ast r = \begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix} \ast \begin{pmatrix} \frac{3}{2} \\ \frac{1}{2} \end{pmatrix} = \begin{pmatrix} 5 \\ 2 \end{pmatrix}$$
Then we perform a rotation transformation for vector rE in the e1 and e2 vector space.
$$r_E' = T_E \ast r_E = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \ast \begin{pmatrix} 5 \\ 2 \end{pmatrix} = \begin{pmatrix} -2 \\ 5 \end{pmatrix}$$
Lastly we convert $r_E'$ back to the $b_1$, $b_2$ basis with the matrix inverse $B^{-1}$.
$$r' = B^{-1} \ast r_E' = \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 3 \end{pmatrix} \ast \begin{pmatrix} -2 \\ 5 \end{pmatrix} = \begin{pmatrix} -\frac{7}{2} \\ \frac{17}{2} \end{pmatrix}$$
Therefore, the original vector $r = \begin{pmatrix} \frac{3}{2} \\ \frac{1}{2} \end{pmatrix}$ in the $b_1$, $b_2$ basis after being rotated by 90° anti-clockwise becomes $r' = \begin{pmatrix} -\frac{7}{2} \\ \frac{17}{2} \end{pmatrix}$ in the same basis. This is plotted in the graph below.
In general, if we have our basis changing matrix $B$ and the desired transformation matrix $T$ in the normal basis, we can perform a linear transformation in the changed basis from vector $r$ to $r'$ by:
$$r' = B^{-1} \ast T \ast B \ast r$$
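The whole sequence collapses into one line of numpy (assumed) using the example values from above:

```python
import numpy as np

B = np.array([[3.0, 1.0],      # b1, b2 as columns in the e1, e2 basis
              [1.0, 1.0]])
T_E = np.array([[0.0, -1.0],   # 90 degree anti-clockwise rotation
                [1.0,  0.0]])

r = np.array([3 / 2, 1 / 2])   # r in the b1, b2 basis

# Convert to e1, e2; rotate there; convert back to b1, b2.
r_prime = np.linalg.inv(B) @ T_E @ B @ r
print(r_prime)  # [-3.5  8.5], i.e. (-7/2, 17/2)
```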
This idea of doing a transformation in a changed basis might be hard to grasp, but it is a critical step that sets us up for further machine learning concepts. For example, Principal Component Analysis (PCA) makes heavy use of different basis vectors. It would be helpful to think over this entire process and develop a solid understanding of it.
Orthogonal Matrices
Define an Orthogonal Matrix
In order to define an orthogonal matrix, we need to first introduce the concept of the matrix transpose. If we interchange the rows and columns of a matrix, the resulting matrix is called the transpose of the original matrix. We denote this operation by a superscript $t$.
For example, we have a matrix $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. The transpose of $A$ is therefore $A^t = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}$, where the elements off the diagonal are interchanged.
Let's define a special n by n square matrix $A$ whose columns are vectors: $A = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \end{pmatrix}$.
Each column $a_i$ in $A$ is a vector perpendicular to every other column vector $a_j$: $a_i \cdot a_j = 0 \;\; \forall i \neq j$. And all the column vectors have unit length: $|a_i| = 1$.
Now something interesting happens when we multiply matrix $A$ by its transpose matrix $A^t$:
$$A^t \ast A = I$$
The entry at row $i$, column $j$ of the product is just the dot product $a_i \cdot a_j$, which is 1 when $i = j$ and 0 otherwise. So we get an n by n identity matrix as a result. This means $A^t$ is also an inverse of $A$.
Therefore, we can define an orthogonal matrix as one consisting of a set of unit length basis vectors that are all perpendicular to each other.
Since all the basis vectors are of unit length and perpendicular to each other, the determinant of an orthogonal matrix must be either +1 or -1. We can derive this as follows.
$$1 = \det(I) = \det(A^t \ast A) = \det(A^t) \ast \det(A) = \det(A)^2$$
Therefore,
$$\det(A) = \pm 1$$
Whether the determinant is +1 or -1 depends on how we permute the column vectors in A, but this is beyond our discussion here.
We can derive another property of an orthogonal matrix, making use of the fact that $A^t$ is the inverse of $A$: multiplying in the other order, $A \ast A^t = I$, must also hold. The rows of $A$ are the columns of $A^t$, so this shows that the row vectors of matrix $A$ must also be perpendicular to each other and of unit length. Therefore, $A^t$ is also an orthogonal matrix.
Recall from our previous discussion on changing basis that the vector in a new vector space can be easily computed by vector projection, provided the new basis vectors are all perpendicular to each other. This is exactly what we get with an orthogonal matrix. Each column in the orthogonal matrix can be treated as a basis vector, and they are perpendicular to each other. Since each has unit length, the result of the vector projection is simply the dot product ($\frac{r \cdot b_i}{|b_i|} \ast \frac{1}{|b_i|} = r \cdot b_i$ when $|b_i| = 1$, for $i \in 1, 2, \cdots, n$).
Therefore in data science we would like to use an orthogonal matrix as the basis vector set for transforming our data. There are a few advantages of doing so.
The matrix inverse can be computed easily because $A^{-1} = A^t$.
The matrix transformation is reversible because it does not collapse space ($\det(A) = \pm 1$).
Changing basis can be computed easily with vector projection.
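These properties are easy to confirm numerically. A small sketch (assuming numpy) using a 2-D rotation matrix, a classic example of an orthogonal matrix:

```python
import numpy as np

theta = 0.3
# A rotation matrix is orthogonal: its columns are unit length
# and perpendicular to each other.
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

assert np.allclose(A.T @ A, np.eye(2))         # A^t * A = I
assert np.isclose(abs(np.linalg.det(A)), 1.0)  # det(A) is +1 or -1
assert np.allclose(np.linalg.inv(A), A.T)      # the inverse is the transpose
```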
How to Construct an Orthogonal Matrix
We already know that it's convenient if our computation involves an orthogonal matrix. But how do we get one? We are going to walk through the process of constructing an orthogonal matrix here, called the Gram-Schmidt process.
We first define a set of vectors $V = \{v_1, v_2, \cdots, v_n\}$ in which all vectors are linearly independent of each other (this can be verified by checking that the matrix with these vectors as columns has a non-zero determinant). However, they are neither perpendicular to each other nor of unit length at this stage. We are going to construct a set of orthogonal basis vectors out of this vector set $V$.
We start with the vector v1 and define our first basis vector e1 as
$$e_1 = \frac{v_1}{|v_1|}$$
So e1 is just the normalized vector of v1 (with length 1).
For the second vector $v_2$, we can treat it as the sum of two vectors: one in the same direction as $e_1$ and the other perpendicular to $e_1$ ($v_2 = e_{1\parallel} + e_{1\perp}$). To find the vector in the same direction as $e_1$, we project $v_2$ onto $e_1$.
$$e_{1\parallel} = \frac{v_2 \cdot e_1}{|e_1|} \ast \frac{e_1}{|e_1|}$$
Since $e_1$ has length 1, $|e_1| = 1$, this simplifies to
$$e_{1\parallel} = (v_2 \cdot e_1) \ast e_1$$
Then the vector perpendicular to $e_1$ can be calculated by subtracting the parallel component $(v_2 \cdot e_1) e_1$ from $v_2$. We denote this vector as $u_2$. So
$$u_2 = v_2 - (v_2 \cdot e_1) \ast e_1$$
The second basis vector e2 is thus obtained by normalizing u2.
$$e_2 = \frac{u_2}{|u_2|}$$
Let’s move on to the third vector v3. To find the vector u3 that is perpendicular to both e1 and e2, we subtract from v3 the respective components of v3 which are in the same direction as e1 and e2.
$$u_3 = v_3 - (v_3 \cdot e_1) e_1 - (v_3 \cdot e_2) e_2$$
Then we find the basis vector e3 by normalizing u3.
$$e_3 = \frac{u_3}{|u_3|}$$
We can continue the same process for more vectors $v_4, v_5, \ldots, v_n$ until we have a set of vectors that are all perpendicular to each other and of unit length. This set of vectors then forms the columns of the orthogonal matrix we need.
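The steps above can be sketched as a short function (numpy assumed; `gram_schmidt` is a hypothetical helper name):

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn linearly independent vectors into an orthonormal basis."""
    basis = []
    for v in vectors:
        u = v.astype(float)
        for e in basis:
            u = u - (v @ e) * e  # remove the component along each earlier e
        basis.append(u / np.linalg.norm(u))
    return np.array(basis).T     # basis vectors as columns

V = [np.array([1, 1, 1]), np.array([2, 0, 1]), np.array([3, 1, -1])]
E = gram_schmidt(V)

# The result is an orthogonal matrix: E^t * E = I.
assert np.allclose(E.T @ E, np.eye(3))
```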
Let’s put together everything we have learned so far with a concrete example.
We have 3 vectors, $v_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $v_2 = \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}$ and $v_3 = \begin{pmatrix} 3 \\ 1 \\ -1 \end{pmatrix}$ that define a 3-D space (verify that these three vectors are linearly independent). There is a vector $r = \begin{pmatrix} 2 \\ 3 \\ 5 \end{pmatrix}$ lying in the same space defined by $v_1$, $v_2$ and $v_3$. Our task is to find a vector $r'$ that is the mirror reflection of vector $r$ in the 2-D plane defined by vectors $v_1$ and $v_2$.
This problem seems complicated because it is very hard to find a transformation matrix that reflects a vector in the $v_1$, $v_2$ plane directly. However, making use of what we have already learned, this problem can be broken down into these steps:
1. Define an orthogonal matrix $E$ consisting of basis vectors $e_1$, $e_2$ and $e_3$ derived from $v_1$, $v_2$ and $v_3$.
2. Convert vector $r$ to $r_E$ in the new vector space defined by $E$.
3. Perform the mirror reflection of $r_E$ in this new space to get the transformed vector $r_E'$.
4. Convert $r_E'$ back to the original space as $r'$.
We start by creating the orthogonal matrix $E$ with the Gram-Schmidt process.
The first basis vector $e_1$ is just the normalized $v_1$:
$$e_1 = \frac{v_1}{|v_1|} = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$
Following the same process, $u_2 = v_2 - (v_2 \cdot e_1) e_1 = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}$ gives $e_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}$, and $u_3 = v_3 - (v_3 \cdot e_1) e_1 - (v_3 \cdot e_2) e_2 = \begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix}$ gives $e_3 = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix}$.
In the basis of $e_1$, $e_2$ and $e_3$, the reflection matrix can be defined as $T_E = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$ because the components in the $e_1$ and $e_2$ directions remain the same while the component in the $e_3$ direction is inverted.
Let's do the reflection of $r_E$ by transformation matrix $T_E$. Converting $r$ into the new basis first gives $r_E = E^t \ast r = \begin{pmatrix} 10/\sqrt{3} \\ -1/\sqrt{2} \\ -5/\sqrt{6} \end{pmatrix}$. The reflection flips the $e_3$ component, $r_E' = T_E \ast r_E = \begin{pmatrix} 10/\sqrt{3} \\ -1/\sqrt{2} \\ 5/\sqrt{6} \end{pmatrix}$, and converting back to the original space yields $r' = E \ast r_E' = \frac{1}{3} \begin{pmatrix} 11 \\ 14 \\ 5 \end{pmatrix}$.
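The whole pipeline can be verified with a short numpy sketch (numpy assumed; the basis vectors come from applying the Gram-Schmidt process to $v_1$, $v_2$, $v_3$):

```python
import numpy as np

# Orthonormal basis obtained from Gram-Schmidt on v1, v2, v3.
e1 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
e2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
e3 = np.array([1.0, 1.0, -2.0]) / np.sqrt(6)
E = np.column_stack([e1, e2, e3])

T_E = np.diag([1.0, 1.0, -1.0])  # reflection flips the e3 component

r = np.array([2.0, 3.0, 5.0])

# Change basis, reflect, change back; E is orthogonal so E^-1 = E^t.
r_prime = E @ T_E @ E.T @ r
print(r_prime)  # approximately (11/3, 14/3, 5/3)
```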
This wraps up our discussion on matrix inverses and matrix transformations. It is really fun and useful. We can apply these techniques to a lot of image-related problems in machine learning that require transformations of shape, orientation, position, etc. It also sets us up for the next topic of eigenvectors and eigenvalues.
(Inspired by Mathematics for Machine Learning lecture series from Imperial College London)