Matrix Inverse and Determinant

Matrix Inverse

We will now turn our attention to solving simultaneous equations. Elimination and substitution are the typical methods we employ to solve simultaneous equations. It turns out matrix multiplication offers another approach to obtaining a solution.

This relies on one property of matrices called the matrix inverse. Multiplying a matrix by its inverse results in an identity matrix.

A^{-1}*A = I

where A^{-1} is the inverse of matrix A and I is the identity matrix.

To solve a simultaneous equation A*r=s for vector r, we can rearrange the equation as follows

\begin{aligned} A*r &= s \\ A^{-1}*(A*r)&=A^{-1}*s \\ r&=A^{-1}*s \end{aligned}

Therefore, the solution for vector r can be obtained by multiplying A^{-1} with s.

However, finding a matrix inverse is a non-trivial task. There exists a shortcut to calculate the inverse of a 2 by 2 square matrix.

\begin{pmatrix}a&b\\c&d\end{pmatrix}^{-1}=\frac{1}{ad-bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix} \tag{1}

To find the inverse of a matrix in higher dimensions, QR decomposition could be one approach, but that is out of scope for our discussion here.
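As a quick sanity check, the 2 by 2 shortcut in equation (1) is easy to code. Below is a minimal numpy sketch (the function name `inverse_2x2` and the example matrix are our own choices, not from the text) that inverts a 2 by 2 matrix and uses it to solve A*r=s:

```python
import numpy as np

def inverse_2x2(M):
    # shortcut from equation (1): swap a and d, negate b and c,
    # then divide by the determinant ad - bc
    a, b = M[0]
    c, d = M[1]
    det = a * d - b * c
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[1.0, 2.0], [3.0, 4.0]])
s = np.array([5.0, 6.0])

r = inverse_2x2(A) @ s            # r = A^{-1} * s
print(np.allclose(A @ r, s))      # True: A*r recovers s
```

The same result comes from `np.linalg.solve(A, s)`, which is the preferred route for larger systems.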

Matrix Determinant

One concept closely related to the matrix inverse is the matrix determinant. For a matrix A=\begin{pmatrix}a&b\\c&d\end{pmatrix}, we can draw a parallelogram with vectors \begin{pmatrix}a\\c\end{pmatrix} and \begin{pmatrix}b\\d\end{pmatrix}. The determinant of this matrix is then defined as the area of this parallelogram.

Mathematics Basics - Linear Algebra (Matrix Part 2)

This area is calculated by the formula below.

\begin{aligned} |A| &= (a+b)*(c+d)-ac-bd-2bc\\ &=ad-bc \end{aligned}

The symbol for determinant is two vertical bars (|), just like the modulus operator for vectors.

Recall that ad-bc is also the term we used in the matrix inverse calculation for a 2 by 2 matrix (equation 1). Therefore our matrix inverse formula can be simplified as

A^{-1}=\frac{1}{|A|}*\begin{pmatrix}d&-b\\-c&a\end{pmatrix} \tag{2}

We can also see that not all matrices are invertible. In order to find the inverse of a matrix, we have to compute its determinant first. However, for a matrix like \begin{pmatrix}1&2\\1&2\end{pmatrix}, the determinant is 1*2-2*1=0. This results in a division by 0 when we substitute the determinant value into equation (2). This is because for a matrix to have a non-zero determinant, its column vectors must be linearly independent. In our example 2 by 2 matrix A, \begin{pmatrix}a\\c\end{pmatrix} and \begin{pmatrix}b\\d\end{pmatrix} must not lie on the same line for a valid matrix inverse to exist. In the simultaneous equation A*r=s, there is either no solution or an infinite number of solutions for vector r if matrix A has no inverse.
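We can observe this failure numerically. A short numpy sketch (using the singular matrix from the example above):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [1.0, 2.0]])        # the two columns lie on the same line

print(np.linalg.det(A))           # 0.0, so equation (2) would divide by zero

try:
    np.linalg.inv(A)
except np.linalg.LinAlgError:
    print("A is singular and has no inverse")
```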

Matrices Changing Basis

Changing Basis in General

We are going to revisit the topic of changing basis here after we have grasped the concept of matrix transformation on vectors.

First, let’s define 2 new basis vectors b_1 and b_2 where b_1=\begin{pmatrix}3\\1\end{pmatrix} and b_2=\begin{pmatrix}1\\1\end{pmatrix}. Recall from our matrix transformation: the new basis vectors b_1 and b_2 are in fact the transformations of basis vectors e_1=\begin{pmatrix}1\\0\end{pmatrix} and e_2=\begin{pmatrix}0\\1\end{pmatrix} by the matrix \begin{pmatrix}3&1\\1&1\end{pmatrix}. Note b_1 and b_2 can also be expressed in the basis of e_1 and e_2.

b_1=\begin{pmatrix}3&1\\1&1\end{pmatrix}*\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}3\\1\end{pmatrix}=3e_1+1e_2\\ b_2=\begin{pmatrix}3&1\\1&1\end{pmatrix}*\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}1\\1\end{pmatrix}=1e_1+1e_2

Now we have a vector r that is defined in the b_1 and b_2 basis as r=\frac{3}{2}b_1+\frac{1}{2}b_2. How do we get the same vector r expressed in the e_1 and e_2 basis? We can substitute in vectors b_1 and b_2 expressed in the e_1 and e_2 basis.

\begin{aligned} r_E&=\frac{3}{2}b_1+\frac{1}{2}b_2\\ &=\frac{3}{2}\begin{pmatrix}3\\1\end{pmatrix}+\frac{1}{2}\begin{pmatrix}1\\1\end{pmatrix}\\ &=\begin{pmatrix}5\\2\end{pmatrix} \end{aligned}

Alternatively, we can convert the vector r=\frac{3}{2}b_1+\frac{1}{2}b_2 from the b_1 and b_2 basis to the e_1 and e_2 basis by multiplying by the transformation matrix \begin{pmatrix}3&1\\1&1\end{pmatrix}.

\begin{aligned} r_E&=\begin{pmatrix}3&1\\1&1\end{pmatrix}*\begin{pmatrix}\frac{3}{2}\\\frac{1}{2}\end{pmatrix}\\ &=\begin{pmatrix}5\\2\end{pmatrix} \end{aligned}

This is illustrated in the graph below. Note the expression for vectors in the e_1 and e_2 basis is colored black while that for vectors in the b_1 and b_2 basis is colored orange.


That is how we convert a vector from the b_1 and b_2 basis to the e_1 and e_2 basis. But what is more interesting is to convert a vector from the e_1 and e_2 basis to the b_1 and b_2 basis. This should somehow “reverse” our previous process.

We first need to find out where e_1 and e_2 are in the b_1 and b_2 basis. This is where the matrix inverse comes into play.

\begin{pmatrix}3&1\\1&1\end{pmatrix}^{-1}=\frac{1}{3-1}\begin{pmatrix}1&-1\\-1&3\end{pmatrix}=\frac{1}{2}\begin{pmatrix}1&-1\\-1&3\end{pmatrix}

Therefore, e_1=\frac{1}{2}b_1-\frac{1}{2}b_2 and e_2=-\frac{1}{2}b_1+\frac{3}{2}b_2 in the b_1 and b_2 basis. We can verify this by substituting the values of b_1 and b_2 back to get the original e_1 and e_2.

e_1=\frac{1}{2}b_1-\frac{1}{2}b_2 =\frac{1}{2}\begin{pmatrix}3\\1\end{pmatrix}-\frac{1}{2}\begin{pmatrix}1\\1\end{pmatrix} =\begin{pmatrix}1\\0\end{pmatrix}\\ e_2=-\frac{1}{2}b_1+\frac{3}{2}b_2 =-\frac{1}{2}\begin{pmatrix}3\\1\end{pmatrix}+\frac{3}{2}\begin{pmatrix}1\\1\end{pmatrix} =\begin{pmatrix}0\\1\end{pmatrix}

Now we are ready to convert vector r=\begin{pmatrix}5\\2\end{pmatrix} from the e_1 and e_2 basis to the b_1 and b_2 basis.

\begin{aligned} r_B&=\begin{pmatrix}3&1\\1&1\end{pmatrix}^{-1}*\begin{pmatrix}5\\2\end{pmatrix}\\ &=\frac{1}{2}\begin{pmatrix}1&-1\\-1&3\end{pmatrix}*\begin{pmatrix}5\\2\end{pmatrix}\\ &=\begin{pmatrix}\frac{3}{2}\\\frac{1}{2}\end{pmatrix} \end{aligned}
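The two conversions can be checked with a few lines of numpy (a sketch; the variable names are our own):

```python
import numpy as np

B = np.array([[3.0, 1.0],
              [1.0, 1.0]])            # columns are b1 and b2 in the e-basis

r_B = np.array([1.5, 0.5])            # r expressed in the b-basis
r_E = B @ r_B                         # to the e-basis: [5, 2]
r_back = np.linalg.inv(B) @ r_E       # back to the b-basis: [1.5, 0.5]
print(r_E, r_back)
```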

This demonstrates a complete loop of vector conversion between 2 sets of basis vectors: e_1, e_2 and b_1, b_2. We can add the expressions for all the discussed vectors in the graph below.


To sum it up, we always need to find the matrix representation of the current basis vectors in the basis of the target vector space. If we want to convert a vector from the b_1 and b_2 basis to the e_1 and e_2 basis, we need the matrix \begin{pmatrix}3&1\\1&1\end{pmatrix}, which is b_1 and b_2 expressed in the e_1 and e_2 basis. Conversely, if we want to convert a vector from the e_1 and e_2 basis to the b_1 and b_2 basis, we need the matrix \frac{1}{2}\begin{pmatrix}1&-1\\-1&3\end{pmatrix}, which is e_1 and e_2 expressed in the b_1 and b_2 basis. Then we multiply this matrix with the vector in the current vector space to get the converted vector in the target vector space.

The transformation matrix provides us with a way to change vector basis in the general case. We have learned previously that when the target basis vectors b_1 and b_2 are orthogonal to each other, there is an easier approach that avoids computing this transformation matrix. We can obtain the new expression of a vector by projecting it onto the target basis vectors b_1 and b_2 directly.

For example, we have our orthogonal target basis vectors b_1=\frac{1}{\sqrt2}\begin{pmatrix}1\\1\end{pmatrix} and b_2=\frac{1}{\sqrt2}\begin{pmatrix}-1\\1\end{pmatrix}. To verify that b_1 and b_2 are orthogonal to each other, we can compute the dot product b_1\cdot b_2=0. A vector r=\frac{1}{\sqrt2}\begin{pmatrix}1\\3\end{pmatrix} originally in the e_1 and e_2 basis can then be converted to the b_1 and b_2 basis as:

\begin{aligned} \frac{r\cdot b_1}{|b_1|}*\frac{1}{|b_1|} &=\frac{\frac{1}{\sqrt2}\begin{pmatrix}1\\3\end{pmatrix}\cdot \frac{1}{\sqrt2}\begin{pmatrix}1\\1\end{pmatrix}}{\frac{1}{\sqrt2}\begin{pmatrix}1\\1\end{pmatrix}\cdot \frac{1}{\sqrt2}\begin{pmatrix}1\\1\end{pmatrix}}\\ &=\frac{\frac{4}{2}}{\frac{2}{2}}\\ &=2 \end{aligned}

\begin{aligned} \frac{r\cdot b_2}{|b_2|}*\frac{1}{|b_2|} &=\frac{\frac{1}{\sqrt2}\begin{pmatrix}1\\3\end{pmatrix}\cdot\frac{1}{\sqrt2}\begin{pmatrix}-1\\1\end{pmatrix}}{\frac{1}{\sqrt2}\begin{pmatrix}-1\\1\end{pmatrix}\cdot\frac{1}{\sqrt2}\begin{pmatrix}-1\\1\end{pmatrix}}\\ &=\frac{\frac{2}{2}}{\frac{2}{2}}\\ &=1 \end{aligned}

Therefore, r=2b_1+1b_2 in the b_1 and b_2 basis.

Let’s verify this result by our new transformation matrix method.

\begin{aligned} r_B&=\left(\frac{1}{\sqrt2}\begin{pmatrix}1&-1\\1&1\end{pmatrix}\right)^{-1}*\frac{1}{\sqrt2}\begin{pmatrix}1\\3\end{pmatrix}\\ &=\frac{1}{2}\begin{pmatrix}1&1\\-1&1\end{pmatrix}*\begin{pmatrix}1\\3\end{pmatrix}\\ &=\frac{1}{2}\begin{pmatrix}4\\2\end{pmatrix}\\ &=\begin{pmatrix}2\\1\end{pmatrix}\\ &=2b_1+1b_2 \end{aligned}
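Both routes are easy to confirm numerically. Here is a small numpy sketch of the projection shortcut (variable names are our own):

```python
import numpy as np

b1 = np.array([1.0, 1.0]) / np.sqrt(2)
b2 = np.array([-1.0, 1.0]) / np.sqrt(2)
r = np.array([1.0, 3.0]) / np.sqrt(2)

# with unit-length, mutually orthogonal basis vectors, the new
# coordinates are just dot products
r_B = np.array([r @ b1, r @ b2])      # [2, 1]
print(r_B)
```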

So we know how to perform a change of basis when the target basis vectors are orthogonal to each other. We also know how to perform such a change in the general case by using the matrix inverse. This is a very important technique, and we will use it often in solving other matrix-related problems.

Doing Transformation in a Changed Basis

There is one more extension to our matrices changing basis concept. Going back to our previous example of new basis vectors b_1=\begin{pmatrix}3\\1\end{pmatrix}, b_2=\begin{pmatrix}1\\1\end{pmatrix}: for the vector r=\begin{pmatrix}\frac{3}{2}\\\frac{1}{2}\end{pmatrix} in the b_1 and b_2 basis, we want to find a vector r' that is the result of rotating r by 90° anti-clockwise. The difficult part is that this rotation happens not in the original e_1 and e_2 basis, but is referenced to the b_1 and b_2 basis. How shall we do that?

We might not know how to express the 90° anti-clockwise rotation transformation matrix in the b_1 and b_2 basis. Nonetheless, we know how to do this rotation in our original basis vectors e_1 and e_2. This rotation transformation matrix is given by T_E=\begin{pmatrix}0&-1\\1&0\end{pmatrix} (recall from our previous discussion on matrix transformation). So here is what we can do to accomplish our goal.

We first convert the vector r into the e_1 and e_2 basis by the transformation matrix B=\begin{pmatrix}3&1\\1&1\end{pmatrix}.

\begin{aligned} r_E&=B*r\\ &=\begin{pmatrix}3&1\\1&1\end{pmatrix}*\begin{pmatrix}\frac{3}{2}\\\frac{1}{2}\end{pmatrix}\\ &=\begin{pmatrix}5\\2\end{pmatrix} \end{aligned}

Then we perform a rotation transformation on vector r_E in the e_1 and e_2 vector space.

\begin{aligned} r'_E&=T_E*r_E\\ &=\begin{pmatrix}0&-1\\1&0\end{pmatrix}*\begin{pmatrix}5\\2\end{pmatrix}\\ &=\begin{pmatrix}-2\\5\end{pmatrix} \end{aligned}

Lastly we convert r'_E back to the b_1 and b_2 basis with the matrix inverse B^{-1}.

\begin{aligned} r'_B&=B^{-1}*r'_E\\ &=\begin{pmatrix}3&1\\1&1\end{pmatrix}^{-1}*r'_E\\ &=\frac{1}{2}\begin{pmatrix}1&-1\\-1&3\end{pmatrix}*\begin{pmatrix}-2\\5\end{pmatrix}\\ &=\begin{pmatrix}-\frac{7}{2}\\\frac{17}{2}\end{pmatrix} \end{aligned}

Therefore, the original vector r=\begin{pmatrix}\frac{3}{2}\\\frac{1}{2}\end{pmatrix} in the b_1 and b_2 basis, after being rotated by 90° anti-clockwise, becomes r'=\begin{pmatrix}-\frac{7}{2}\\\frac{17}{2}\end{pmatrix} in the same basis. This is plotted in the graph below.


In general, if we have our basis-changing matrix B and the desired transformation matrix T in the standard basis, we can perform a linear transformation in the changed basis from vector r to r' by:
r'=B^{-1}*T*B*r
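Using the numbers from the rotation example, this composition can be checked in numpy (a sketch, not part of the original text):

```python
import numpy as np

B = np.array([[3.0, 1.0], [1.0, 1.0]])    # b1 and b2 in the e-basis
T = np.array([[0.0, -1.0], [1.0, 0.0]])   # 90° anti-clockwise rotation in the e-basis
r = np.array([1.5, 0.5])                  # vector in the b-basis

r_prime = np.linalg.inv(B) @ T @ B @ r    # [-3.5, 8.5], i.e. (-7/2, 17/2)
print(r_prime)
```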
This idea of doing a transformation in a changed basis might be hard to grasp, but it is a critical step to set us up for further machine learning concepts. For example, Principal Component Analysis (PCA) will often make use of different basis vectors. It would be helpful if you can think over this entire process and gain a solid understanding of it.

Orthogonal Matrices

Define an Orthogonal Matrix

In order to define an orthogonal matrix, we need to first introduce the concept of matrix transpose. If we interchange all the elements of the rows and columns in a matrix, the resultant matrix is called the transpose of the original matrix. We denote this operation by the superscript t.

For example, we have a matrix A=\begin{pmatrix}1&2\\3&4\end{pmatrix}. The transpose of A is therefore
A^t=\begin{pmatrix}1&3\\2&4\end{pmatrix}
where elements that are off the diagonal are interchanged.

Let’s define a special n by n square matrix A, where
A=\Bigg(\bigg(a_1\bigg)\bigg(a_2\bigg)\cdots\bigg(a_n\bigg)\Bigg)
Each column a_i in A is a vector perpendicular to every other column vector a_j, i.e. a_i\cdot a_j=0\ \forall i\neq j. And all the column vectors have unit length, |a_i|=1.

Now something interesting happens. We can multiply matrix A by its transpose matrix A^t,

\begin{aligned} A^t*A&=\begin{pmatrix}(a_1)^t\\(a_2)^t\\\vdots\\(a_n)^t\end{pmatrix} *\Bigg(\bigg(a_1\bigg)\bigg(a_2\bigg)\cdots\bigg(a_n\bigg)\Bigg)\\ &=\begin{pmatrix}1&0&\cdots&0\\ 0&1&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&1\end{pmatrix} \end{aligned}

We get an n by n identity matrix as a result. This means A^t is also the inverse of A.

Therefore, we can define an orthogonal matrix as one consisting of a set of unit length basis vectors that are all perpendicular to each other.

Since all the basis vectors are of length 1, the determinant of an orthogonal matrix must be either +1 or -1. We can derive the matrix determinant below.

\begin{aligned} 1&=det(I)\\ &=det(A^t*A)\\ &=det(A^t)*det(A)\\ &=det(A)^2 \end{aligned}

Therefore,

det(A)=\pm1

Whether the determinant is +1 or -1 depends on how we permute the column vectors in A, but this is beyond our discussion here.

We can derive another property of an orthogonal matrix, making use of the fact that A^t is the inverse of A.

\begin{aligned} A^t*A&=I\\ A*(A^t*A)&=A*I\\ (A*A^t)*A&=A\\ A*A^t&=I\\ \Bigg(\bigg(a_1\bigg)\bigg(a_2\bigg)\cdots\bigg(a_n\bigg)\Bigg)*\begin{pmatrix}(a_1)^t\\(a_2)^t\\\vdots\\(a_n)^t\end{pmatrix} &=I \end{aligned}

The multiplication of the rows of A and the columns of A^t still yields an identity matrix. This shows that the row vectors of matrix A must also be perpendicular to each other. Therefore, A^t is also an orthogonal matrix.

Recall from our previous discussion on changing basis that the vector in a new vector space can be easily computed by vector projection, provided the new basis vectors are all perpendicular to each other. This is exactly what we get here with an orthogonal matrix. Each column in the orthogonal matrix can be treated as a basis vector, and they are perpendicular to each other. Since the unit length is 1, the result of vector projection is simply the dot product (\frac{r\cdot b_i}{|b_i|}*\frac{1}{|b_i|}=r\cdot b_i,\ i\in 1,2,\cdots,n).

Therefore in data science we would like to use an orthogonal matrix as the basis vector set for transforming our data. There are a few advantages of doing so.

  1. The matrix inverse can be computed easily because A^{-1}=A^t.
  2. The matrix transformation is reversible because it does not collapse space (det(A)=\pm1\neq0).
  3. Changing basis can be computed easily with vector projection.
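All three advantages are visible in a couple of lines of numpy. The example matrix below (our own choice) has orthogonal, unit-length columns:

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [1.0,  1.0]]) / np.sqrt(2)   # orthonormal columns

print(np.allclose(A.T @ A, np.eye(2)))     # True: A^t is the inverse of A
print(round(np.linalg.det(A), 6))          # 1.0 (always +1 or -1)
```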

How to Construct an Orthogonal Matrix

We already know that it’s convenient if our computation involves an orthogonal matrix. But how do we get an orthogonal matrix? We are going to walk through the process of constructing one here. This process is called the Gram-Schmidt process.

We first define a set of vectors V=\{v_1,v_2,\cdots,v_n\} in which all vectors are linearly independent of each other (which can be verified by checking that the matrix formed by these vectors has a non-zero determinant). However, they are neither perpendicular to each other nor of unit length at this stage. We are going to construct a set of orthogonal basis vectors out of this vector set V.

We start with the vector v_1 and define our first basis vector e_1 as

e_1=\frac{v_1}{|v_1|}

So e_1 is just the normalized vector of v_1 (with length 1).

For the second vector v_2, we can treat it as the sum of two vectors: one in the same direction as e_1 and the other perpendicular to e_1 (v_2=e_{1\parallel}+e_{1\perp}). To find the vector in the same direction as e_1, we project v_2 onto e_1.

e_{1\parallel}=\frac{v_2\cdot e_1}{|e_1|}*\frac{e_1}{|e_1|}

Since e_1 has length 1, |e_1|=1.

e_{1\parallel}=(v_2\cdot e_1)*e_1

Then the vector perpendicular to e_1 can be calculated by subtracting the parallel component (v_2\cdot e_1)e_1 from v_2. We denote this vector as u_2. So

u_2=v_2-(v_2\cdot e_1)*e_1

The second basis vector e_2 is thus obtained by normalizing u_2.

e_2=\frac{u_2}{|u_2|}

Let’s move on to the third vector v_3. To find the vector u_3 that is perpendicular to both e_1 and e_2, we subtract from v_3 the respective components of v_3 that are in the same direction as e_1 and e_2.

u_3=v_3-(v_3\cdot e_1)e_1-(v_3\cdot e_2)e_2

Then we find the basis vector e_3 by normalizing u_3.

e_3=\frac{u_3}{|u_3|}

We can continue the same process for more vectors v_4, v_5, … v_n until we have a set of vectors that are all perpendicular to each other and of unit length 1. This set of vectors then forms the orthogonal matrix we need.
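The whole procedure can be sketched in a short numpy function. This is our own illustrative implementation: for each vector it subtracts the projection onto every earlier basis vector, then normalizes, exactly as in the steps above.

```python
import numpy as np

def gram_schmidt(vectors):
    """Build orthonormal basis vectors from linearly independent vectors."""
    basis = []
    for v in vectors:
        u = np.array(v, dtype=float)
        for e in basis:
            u -= (u @ e) * e            # remove the component along e
        basis.append(u / np.linalg.norm(u))
    return np.column_stack(basis)       # columns form an orthogonal matrix

E = gram_schmidt([[1, 1, 1], [2, 0, 1], [3, 1, -1]])
print(np.allclose(E.T @ E, np.eye(3)))  # True
```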

Let’s put together everything we have learned so far with a concrete example.

We have 3 vectors, v_1=\begin{pmatrix}1\\1\\1\end{pmatrix}, v_2=\begin{pmatrix}2\\0\\1\end{pmatrix} and v_3=\begin{pmatrix}3\\1\\-1\end{pmatrix}, that define a 3-D space (verify that these three vectors are linearly independent). There is a vector r=\begin{pmatrix}2\\3\\5\end{pmatrix} lying in the same space defined by v_1, v_2 and v_3. Our task is to find a vector r' that is the mirror reflection of vector r in the 2-D plane defined by vectors v_1 and v_2.

This problem seems complicated because it is very hard to find a transformation matrix that reflects a vector in the v_1 and v_2 plane directly. However, making use of what we have already learned, this problem can be broken down into these steps:

  1. Define an orthogonal matrix E consisting of basis vectors e_1, e_2 and e_3 from v_1, v_2 and v_3.
  2. Convert vector r to r_E in a new vector space defined by E.
  3. Perform the mirror reflection of r_E in this new space to get the transformed vector r'_E.
  4. Convert r'_E back to the original space as r'.

We start by creating the orthogonal matrix E with the Gram-Schmidt process.

The first basis vector e_1 is just the normalized v_1.

e_1=\frac{v_1}{|v_1|}=\frac{1}{\sqrt3}\begin{pmatrix}1\\1\\1\end{pmatrix}

Then we find the vector u_2 perpendicular to e_1.

\begin{aligned} u_2&=v_2-(v_2\cdot e_1)e_1\\ &=\begin{pmatrix}2\\0\\1\end{pmatrix}-\frac{1}{\sqrt3}\left(\begin{pmatrix}2\\0\\1\end{pmatrix}\cdot\begin{pmatrix}1\\1\\1\end{pmatrix}\right)*\frac{1}{\sqrt3}\begin{pmatrix}1\\1\\1\end{pmatrix}\\ &=\begin{pmatrix}2\\0\\1\end{pmatrix}-\frac{3}{3}\begin{pmatrix}1\\1\\1\end{pmatrix}\\ &=\begin{pmatrix}1\\-1\\0\end{pmatrix} \end{aligned}

The second basis vector e_2 is obtained by normalizing u_2.

e_2=\frac{u_2}{|u_2|}=\frac{1}{\sqrt2}\begin{pmatrix}1\\-1\\0\end{pmatrix}

We continue this process to find u_3, which is perpendicular to both e_1 and e_2.

\begin{aligned} u_3&=v_3-(v_3\cdot e_1)e_1-(v_3\cdot e_2)e_2\\ &=\begin{pmatrix}3\\1\\-1\end{pmatrix}-\frac{1}{\sqrt3}\left(\begin{pmatrix}3\\1\\-1\end{pmatrix}\cdot\begin{pmatrix}1\\1\\1\end{pmatrix}\right)*\frac{1}{\sqrt3}\begin{pmatrix}1\\1\\1\end{pmatrix}-\frac{1}{\sqrt2}\left(\begin{pmatrix}3\\1\\-1\end{pmatrix}\cdot\begin{pmatrix}1\\-1\\0\end{pmatrix}\right)*\frac{1}{\sqrt2}\begin{pmatrix}1\\-1\\0\end{pmatrix}\\ &=\begin{pmatrix}3\\1\\-1\end{pmatrix}-\frac{3}{3}\begin{pmatrix}1\\1\\1\end{pmatrix}-\frac{2}{2}\begin{pmatrix}1\\-1\\0\end{pmatrix}\\ &=\begin{pmatrix}1\\1\\-2\end{pmatrix} \end{aligned}

Lastly, we normalize u_3 to get e_3.

e_3=\frac{u_3}{|u_3|}=\frac{1}{\sqrt6}\begin{pmatrix}1\\1\\-2\end{pmatrix}

So we have our new vector space defined by the orthogonal matrix E=\begin{pmatrix}\frac{1}{\sqrt3}\begin{pmatrix}1\\1\\1\end{pmatrix}&\frac{1}{\sqrt2}\begin{pmatrix}1\\-1\\0\end{pmatrix}&\frac{1}{\sqrt6}\begin{pmatrix}1\\1\\-2\end{pmatrix}\end{pmatrix}.

Our vector r in this vector space is r_E.

\begin{aligned} r_E&=E^{-1}*r\\ &=\begin{pmatrix}\frac{1}{\sqrt3}\begin{pmatrix}1&1&1\end{pmatrix}\\\frac{1}{\sqrt2}\begin{pmatrix}1&-1&0\end{pmatrix}\\\frac{1}{\sqrt6}\begin{pmatrix}1&1&-2\end{pmatrix}\end{pmatrix}*\begin{pmatrix}2\\3\\5\end{pmatrix}\\ &=\begin{pmatrix}\frac{10}{\sqrt3}\\-\frac{1}{\sqrt2}\\-\frac{5}{\sqrt6}\end{pmatrix} \end{aligned}

In the basis of e_1, e_2 and e_3, a reflection matrix can be defined as T_E=\begin{pmatrix}1&0&0\\0&1&0\\0&0&-1\end{pmatrix} because the components in the e_1 and e_2 directions remain the same while the component in the e_3 direction is inverted.

Let’s do the reflection of r_E by the transformation matrix T_E.

\begin{aligned} r'_E&=T_E*r_E\\ &=\begin{pmatrix}1&0&0\\0&1&0\\0&0&-1\end{pmatrix}*\begin{pmatrix}\frac{10}{\sqrt3}\\-\frac{1}{\sqrt2}\\-\frac{5}{\sqrt6}\end{pmatrix}\\ &=\begin{pmatrix}\frac{10}{\sqrt3}\\-\frac{1}{\sqrt2}\\\frac{5}{\sqrt6}\end{pmatrix} \end{aligned}

The reflected vector r'_E can then be transformed back to its original space.

\begin{aligned} r'&=E*r'_E\\ &=\begin{pmatrix}\frac{1}{\sqrt3}\begin{pmatrix}1\\1\\1\end{pmatrix}&\frac{1}{\sqrt2}\begin{pmatrix}1\\-1\\0\end{pmatrix}&\frac{1}{\sqrt6}\begin{pmatrix}1\\1\\-2\end{pmatrix}\end{pmatrix}*\begin{pmatrix}\frac{10}{\sqrt3}\\-\frac{1}{\sqrt2}\\\frac{5}{\sqrt6}\end{pmatrix}\\ &=\begin{pmatrix}\frac{10}{3}-\frac{1}{2}+\frac{5}{6}\\\frac{10}{3}+\frac{1}{2}+\frac{5}{6}\\\frac{10}{3}+0-\frac{10}{6}\end{pmatrix}\\ &=\frac{1}{3}\begin{pmatrix}11\\14\\5\end{pmatrix} \end{aligned}
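The entire worked example can be replayed in a few lines of numpy. Since E is orthogonal, E^{-1} is just E^t (a sketch; variable names are our own):

```python
import numpy as np

E = np.column_stack([np.array([1, 1, 1]) / np.sqrt(3),
                     np.array([1, -1, 0]) / np.sqrt(2),
                     np.array([1, 1, -2]) / np.sqrt(6)])
T_E = np.diag([1.0, 1.0, -1.0])     # reflect: flip the e3 component only

r = np.array([2.0, 3.0, 5.0])
r_prime = E @ T_E @ E.T @ r         # [11/3, 14/3, 5/3]
print(r_prime)
```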

This wraps up our discussion on the matrix inverse and matrix transformations. It is really fun and useful. We can apply these techniques to a lot of image-related problems in machine learning that require transformations of shape, orientation, position, etc. It also sets us up for the next topic of eigenvectors and eigenvalues.


(Inspired by Mathematics for Machine Learning lecture series from Imperial College London)
