The orthogonal projection of a point in $\mathbb R^2$ onto a line through the origin has an important analogue in $\mathbb R^n$.
Given a vector $\boldsymbol y$ and a subspace $W$ in $\mathbb R^n$, there is a vector $\hat{\boldsymbol y}$ in $W$ such that (1) $\hat{\boldsymbol y}$ is the unique vector in $W$ for which $\boldsymbol y - \hat{\boldsymbol y}$ is orthogonal to $W$, and (2) $\hat{\boldsymbol y}$ is the unique vector in $W$ closest to $\boldsymbol y$. See Figure 1.
EXAMPLE 1
Let $\{\boldsymbol u_1,\dots,\boldsymbol u_5\}$ be an orthogonal basis for $\mathbb R^5$ and let

$$\boldsymbol y = c_1\boldsymbol u_1 + c_2\boldsymbol u_2 + \cdots + c_5\boldsymbol u_5$$

Consider the subspace $W = \text{Span}\{\boldsymbol u_1,\boldsymbol u_2\}$, and write $\boldsymbol y$ as the sum of a vector $\boldsymbol z_1$ in $W$ and a vector $\boldsymbol z_2$ in $W^\perp$.
SOLUTION

Write

$$\boldsymbol y = (c_1\boldsymbol u_1 + c_2\boldsymbol u_2) + (c_3\boldsymbol u_3 + c_4\boldsymbol u_4 + c_5\boldsymbol u_5)$$

and set $\boldsymbol z_1 = c_1\boldsymbol u_1 + c_2\boldsymbol u_2$ and $\boldsymbol z_2 = c_3\boldsymbol u_3 + c_4\boldsymbol u_4 + c_5\boldsymbol u_5$. Then $\boldsymbol z_1$ is in $W = \text{Span}\{\boldsymbol u_1,\boldsymbol u_2\}$. To show that $\boldsymbol z_2$ is in $W^\perp$, it suffices to show that $\boldsymbol z_2$ is orthogonal to $\boldsymbol u_1$ and $\boldsymbol u_2$, the vectors that span $W$. By the orthogonality of the basis,

$$\boldsymbol z_2\cdot\boldsymbol u_1 = c_3\,\boldsymbol u_3\cdot\boldsymbol u_1 + c_4\,\boldsymbol u_4\cdot\boldsymbol u_1 + c_5\,\boldsymbol u_5\cdot\boldsymbol u_1 = 0$$

and similarly $\boldsymbol z_2\cdot\boldsymbol u_2 = 0$. Hence $\boldsymbol z_2$ is orthogonal to every vector in $W$; that is, $\boldsymbol z_2$ is in $W^\perp$.
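For a concrete check of this decomposition, here is a short NumPy sketch (my own illustration, not from the text; the basis comes from a QR factorization and the coefficients are made up):

```python
import numpy as np

# Columns of Q form an orthonormal (hence orthogonal) basis for R^5.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
u = [Q[:, i] for i in range(5)]

c = [3.0, -1.0, 2.0, 0.5, 4.0]                 # made-up coefficients c1..c5
y = sum(ci * ui for ci, ui in zip(c, u))       # y = c1 u1 + ... + c5 u5

z1 = c[0] * u[0] + c[1] * u[1]                 # the piece in W = Span{u1, u2}
z2 = y - z1                                    # the piece claimed to lie in W-perp

print(np.allclose([z2 @ u[0], z2 @ u[1]], 0))  # z2 is orthogonal to u1, u2: True
```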
The next theorem shows that the decomposition $\boldsymbol y = \boldsymbol z_1 + \boldsymbol z_2$ in Example 1 can be computed without having an orthogonal basis for $\mathbb R^n$. It is enough to have an orthogonal basis only for $W$.

THEOREM 8 (THE ORTHOGONAL DECOMPOSITION THEOREM)

Let $W$ be a subspace of $\mathbb R^n$. Then each $\boldsymbol y$ in $\mathbb R^n$ can be written uniquely in the form

$$\boldsymbol y = \hat{\boldsymbol y} + \boldsymbol z \tag{1}$$

where $\hat{\boldsymbol y}$ is in $W$ and $\boldsymbol z$ is in $W^\perp$. In fact, if $\{\boldsymbol u_1,\dots,\boldsymbol u_p\}$ is any orthogonal basis of $W$, then

$$\hat{\boldsymbol y} = \frac{\boldsymbol y\cdot\boldsymbol u_1}{\boldsymbol u_1\cdot\boldsymbol u_1}\boldsymbol u_1 + \cdots + \frac{\boldsymbol y\cdot\boldsymbol u_p}{\boldsymbol u_p\cdot\boldsymbol u_p}\boldsymbol u_p \tag{2}$$

and $\boldsymbol z = \boldsymbol y - \hat{\boldsymbol y}$.
The vector $\hat{\boldsymbol y}$ in (1) is called the orthogonal projection of $\boldsymbol y$ onto $W$ and often is written as $proj_W\boldsymbol y$. See Figure 2. When $W$ is a one-dimensional subspace, the formula for $\hat{\boldsymbol y}$ matches the formula given in Section 6.2.
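Formula (2) translates directly into code. The sketch below is a minimal illustration (the function name `proj_onto_subspace` and the example vectors are my own; it assumes the basis vectors are mutually orthogonal and nonzero):

```python
import numpy as np

def proj_onto_subspace(y, basis):
    """Orthogonal projection of y onto W = Span(basis), per formula (2).

    `basis` is a list of mutually orthogonal, nonzero vectors in R^n.
    Each term (y . u_j / u_j . u_j) u_j is a 1-D projection onto u_j.
    """
    return sum(((y @ u) / (u @ u)) * u for u in basis)

# Project onto the plane spanned by two orthogonal vectors in R^3.
u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])           # u1 . u2 = 0
y  = np.array([1.0, 2.0, 3.0])

y_hat = proj_onto_subspace(y, [u1, u2])
z = y - y_hat
print(np.allclose([z @ u1, z @ u2], 0))   # z is in W-perp: True
```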
PROOF
We may assume that $W$ is not the zero subspace, for otherwise $W^\perp = \mathbb R^n$ and (1) is simply $\boldsymbol y = \boldsymbol 0 + \boldsymbol y$. The next section will show that any nonzero subspace of $\mathbb R^n$ has an orthogonal basis.
Let $\{\boldsymbol u_1,\dots,\boldsymbol u_p\}$ be any orthogonal basis for $W$, and define $\hat{\boldsymbol y}$ by (2). Then $\hat{\boldsymbol y}$ is in $W$ because it is a linear combination of $\boldsymbol u_1,\dots,\boldsymbol u_p$. Let $\boldsymbol z = \boldsymbol y - \hat{\boldsymbol y}$. Since $\boldsymbol u_1$ is orthogonal to $\boldsymbol u_2,\dots,\boldsymbol u_p$, it follows from (2) that

$$\boldsymbol z\cdot\boldsymbol u_1 = (\boldsymbol y - \hat{\boldsymbol y})\cdot\boldsymbol u_1 = \boldsymbol y\cdot\boldsymbol u_1 - \frac{\boldsymbol y\cdot\boldsymbol u_1}{\boldsymbol u_1\cdot\boldsymbol u_1}\boldsymbol u_1\cdot\boldsymbol u_1 - 0 - \cdots - 0 = \boldsymbol y\cdot\boldsymbol u_1 - \boldsymbol y\cdot\boldsymbol u_1 = 0$$
Thus $\boldsymbol z$ is orthogonal to $\boldsymbol u_1$. Similarly, $\boldsymbol z$ is orthogonal to each $\boldsymbol u_j$ in the basis for $W$. Hence $\boldsymbol z$ is orthogonal to every vector in $W$. That is, $\boldsymbol z$ is in $W^\perp$.
To show that the decomposition in (1) is unique, suppose $\boldsymbol y$ can also be written as $\boldsymbol y = \hat{\boldsymbol y}_1 + \boldsymbol z_1$, with $\hat{\boldsymbol y}_1$ in $W$ and $\boldsymbol z_1$ in $W^\perp$. Then $\hat{\boldsymbol y} + \boldsymbol z = \hat{\boldsymbol y}_1 + \boldsymbol z_1$ (since both sides equal $\boldsymbol y$), and so

$$\hat{\boldsymbol y} - \hat{\boldsymbol y}_1 = \boldsymbol z_1 - \boldsymbol z$$
This equality shows that the vector $\boldsymbol v = \hat{\boldsymbol y} - \hat{\boldsymbol y}_1$ is in $W$ and in $W^\perp$. Hence $\boldsymbol v\cdot\boldsymbol v = 0$, which shows that $\boldsymbol v = \boldsymbol 0$. This proves that $\hat{\boldsymbol y} = \hat{\boldsymbol y}_1$ and also $\boldsymbol z_1 = \boldsymbol z$.
EXERCISES
Suppose that $\{\boldsymbol u_1, \boldsymbol u_2\}$ is an orthogonal set of nonzero vectors in $\mathbb R^3$. How would you find an orthogonal basis of $\mathbb R^3$ that contains $\boldsymbol u_1$ and $\boldsymbol u_2$?
SOLUTION
First, find a vector $\boldsymbol v$ in $\mathbb R^3$ that is not in the subspace $W$ spanned by $\boldsymbol u_1$ and $\boldsymbol u_2$. Let $\boldsymbol u_3 = \boldsymbol v - proj_W\boldsymbol v$; then $\boldsymbol u_3$ is orthogonal to $W$ by the Orthogonal Decomposition Theorem, and $\boldsymbol u_3 \neq \boldsymbol 0$ because $\boldsymbol v$ is not in $W$. Hence $\{\boldsymbol u_1, \boldsymbol u_2, \boldsymbol u_3\}$ is an orthogonal basis.
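A numerical version of this construction (the vectors are illustrative choices of my own; `proj_W` implements formula (2)):

```python
import numpy as np

u1 = np.array([1.0, 0.0, 1.0])
u2 = np.array([1.0, 0.0, -1.0])       # orthogonal to u1

def proj_W(v, basis):
    # formula (2): sum of 1-D projections onto the orthogonal basis vectors
    return sum(((v @ u) / (u @ u)) * u for u in basis)

# Any v outside W = Span{u1, u2} works; here (0, 1, 0) is clearly not in W.
v = np.array([0.0, 1.0, 0.0])
u3 = v - proj_W(v, [u1, u2])          # the component of v orthogonal to W

print(np.allclose([u3 @ u1, u3 @ u2], 0), np.any(u3 != 0))   # True True
```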
EXERCISE 23
Let $A$ be an $m \times n$ matrix. Prove that every vector $\boldsymbol x$ in $\mathbb R^n$ can be written in the form $\boldsymbol x = \boldsymbol p + \boldsymbol u$, where $\boldsymbol p$ is in $\text{Row}\,A$ and $\boldsymbol u$ is in $\text{Nul}\,A$. Also, show that if the equation $A\boldsymbol x = \boldsymbol b$ is consistent, then there is a unique $\boldsymbol p$ in $\text{Row}\,A$ such that $A\boldsymbol p = \boldsymbol b$.
SOLUTION
By the Orthogonal Decomposition Theorem, each $\boldsymbol x$ in $\mathbb R^n$ can be written uniquely as $\boldsymbol x = \boldsymbol p + \boldsymbol u$, with $\boldsymbol p$ in $\text{Row}\,A$ and $\boldsymbol u$ in $(\text{Row}\,A)^\perp$. By Theorem 3 in Section 6.1, $\boldsymbol u$ is in $\text{Nul}\,A$.
Next, suppose that $A\boldsymbol x = \boldsymbol b$ is consistent. Let $\boldsymbol x$ be a solution, and write $\boldsymbol x = \boldsymbol p + \boldsymbol u$ as above. Then

$$A\boldsymbol p = A(\boldsymbol x - \boldsymbol u) = A\boldsymbol x - A\boldsymbol u = \boldsymbol b - \boldsymbol 0 = \boldsymbol b$$

so the equation $A\boldsymbol x = \boldsymbol b$ has at least one solution $\boldsymbol p$ in $\text{Row}\,A$.
Finally, suppose that $\boldsymbol p$ and $\boldsymbol p_1$ are both in $\text{Row}\,A$ and both satisfy $A\boldsymbol x = \boldsymbol b$. Then $\boldsymbol p - \boldsymbol p_1$ is in $\text{Nul}\,A$ because

$$A(\boldsymbol p - \boldsymbol p_1) = A\boldsymbol p - A\boldsymbol p_1 = \boldsymbol b - \boldsymbol b = \boldsymbol 0$$
The equations $\boldsymbol p = \boldsymbol p_1 + (\boldsymbol p - \boldsymbol p_1)$ and $\boldsymbol p = \boldsymbol p + \boldsymbol 0$ both decompose $\boldsymbol p$ as the sum of a vector in $\text{Row}\,A$ and a vector in $(\text{Row}\,A)^\perp$. By the uniqueness of the orthogonal decomposition (Theorem 8), $\boldsymbol p_1 = \boldsymbol p$, so $\boldsymbol p$ is unique.
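This decomposition can be computed numerically using an orthonormal basis of $\text{Row}\,A$ taken from the SVD. The sketch below is my own illustration with made-up data, not part of the exercise:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))       # an m x n matrix, here 3 x 5
x = rng.standard_normal(5)

# Orthonormal basis for Row A: the first r rows of Vt from the SVD.
_, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))            # rank of A
V = Vt[:r].T                          # columns span Row A

p = V @ (V.T @ x)                     # projection of x onto Row A
u = x - p                             # the remaining component

print(np.allclose(A @ u, 0))          # u is in Nul A = (Row A)-perp: True
print(np.allclose(A @ p, A @ x))      # p solves A p = A x = b: True
```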
A Geometric Interpretation of the Orthogonal Projection
When $W$ is a one-dimensional subspace, the formula (2) for $proj_W \boldsymbol y$ contains just one term. Thus, when $\dim W > 1$, each term in (2) is itself an orthogonal projection of $\boldsymbol y$ onto a one-dimensional subspace spanned by one of the $\boldsymbol u$'s in the basis for $W$. Figure 3 illustrates this when $W$ is a subspace of $\mathbb R^3$ spanned by $\boldsymbol u_1$ and $\boldsymbol u_2$.
Properties of Orthogonal Projections
If $\boldsymbol y$ is in $W = \text{Span}\{\boldsymbol u_1,\dots,\boldsymbol u_p\}$, then $proj_W \boldsymbol y = \boldsymbol y$. This fact also follows from the next theorem.
THEOREM 9 (THE BEST APPROXIMATION THEOREM)

Let $W$ be a subspace of $\mathbb R^n$, let $\boldsymbol y$ be any vector in $\mathbb R^n$, and let $\hat{\boldsymbol y}$ be the orthogonal projection of $\boldsymbol y$ onto $W$. Then $\hat{\boldsymbol y}$ is the closest point in $W$ to $\boldsymbol y$, in the sense that

$$\left\|\boldsymbol y - \hat{\boldsymbol y}\right\| < \left\|\boldsymbol y - \boldsymbol v\right\| \tag{3}$$

for all $\boldsymbol v$ in $W$ distinct from $\hat{\boldsymbol y}$.
The vector $\hat{\boldsymbol y}$ in Theorem 9 is called the best approximation to $\boldsymbol y$ by elements of $W$.
Later sections in the text will examine problems where a given $\boldsymbol y$ must be replaced, or approximated, by a vector $\boldsymbol v$ in some fixed subspace $W$. The distance $\left\|\boldsymbol y-\boldsymbol v\right\|$ can be regarded as the "error" of using $\boldsymbol v$ in place of $\boldsymbol y$. Theorem 9 says that this error is minimized when $\boldsymbol v = \hat{\boldsymbol y}$.
Inequality (3) leads to a new proof that $\hat{\boldsymbol y}$ does not depend on the particular orthogonal basis used to compute it.
PROOF
Take $\boldsymbol v$ in $W$ distinct from $\hat{\boldsymbol y}$. See Figure 4. Then $\hat{\boldsymbol y} - \boldsymbol v$ is in $W$, so $\boldsymbol y - \hat{\boldsymbol y}$ is orthogonal to $\hat{\boldsymbol y} - \boldsymbol v$. Since

$$\boldsymbol y - \boldsymbol v = (\boldsymbol y - \hat{\boldsymbol y}) + (\hat{\boldsymbol y} - \boldsymbol v)$$

the Pythagorean Theorem gives

$$\left\|\boldsymbol y - \boldsymbol v\right\|^2 = \left\|\boldsymbol y - \hat{\boldsymbol y}\right\|^2 + \left\|\hat{\boldsymbol y} - \boldsymbol v\right\|^2$$
Now $\left\|\hat{\boldsymbol y} - \boldsymbol v\right\| > 0$, and so inequality (3) follows immediately.
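Inequality (3) is easy to observe numerically: any other point of $W$ is farther from $\boldsymbol y$ than $\hat{\boldsymbol y}$ is. A small sketch with illustrative vectors of my own, sampling random points $\boldsymbol v = a\boldsymbol u_1 + b\boldsymbol u_2$ in $W$:

```python
import numpy as np

u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])               # orthogonal to u1
y  = np.array([1.0, 2.0, 3.0])

# y_hat = proj_W y, computed by formula (2)
y_hat = ((y @ u1) / (u1 @ u1)) * u1 + ((y @ u2) / (u2 @ u2)) * u2

# Random points v in W are never closer to y than y_hat is.
rng = np.random.default_rng(2)
for _ in range(5):
    a, b = rng.standard_normal(2)
    v = a * u1 + b * u2
    assert np.linalg.norm(y - y_hat) <= np.linalg.norm(y - v)
print("y_hat is closest among the sampled points of W")
```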
The final theorem in this section shows how formula (2) for $proj_W \boldsymbol y$ is simplified when the basis for $W$ is an orthonormal set.

THEOREM 10

If $\{\boldsymbol u_1,\dots,\boldsymbol u_p\}$ is an orthonormal basis for a subspace $W$ of $\mathbb R^n$, then

$$proj_W \boldsymbol y = (\boldsymbol y\cdot\boldsymbol u_1)\boldsymbol u_1 + (\boldsymbol y\cdot\boldsymbol u_2)\boldsymbol u_2 + \cdots + (\boldsymbol y\cdot\boldsymbol u_p)\boldsymbol u_p \tag{4}$$

If $U = \begin{bmatrix}\boldsymbol u_1 & \boldsymbol u_2 & \cdots & \boldsymbol u_p\end{bmatrix}$, then

$$proj_W \boldsymbol y = UU^T\boldsymbol y \quad \text{for all } \boldsymbol y \text{ in } \mathbb R^n \tag{5}$$
Suppose $U$ is an $n \times p$ matrix with orthonormal columns, and let $W$ be the column space of $U$. Then

$$U^TU\boldsymbol x = \boldsymbol x \quad \text{for all } \boldsymbol x \text{ in } \mathbb R^p$$

$$UU^T\boldsymbol y = proj_W\boldsymbol y \quad \text{for all } \boldsymbol y \text{ in } \mathbb R^n$$
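A minimal sketch of these two identities, assuming $U$ is produced by a reduced QR factorization so that its columns are orthonormal (the data are made up):

```python
import numpy as np

# U: an n x p matrix with orthonormal columns (reduced QR of a 5 x 2 matrix)
rng = np.random.default_rng(3)
U, _ = np.linalg.qr(rng.standard_normal((5, 2)))

x = rng.standard_normal(2)
y = rng.standard_normal(5)

print(np.allclose(U.T @ U @ x, x))        # U^T U x = x for all x in R^p
y_hat = U @ (U.T @ y)                     # proj_W y = U U^T y
print(np.allclose(U.T @ (y - y_hat), 0))  # y - y_hat is in W-perp: True
```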
EXAMPLE
Let $W$ be a subspace of $\mathbb R^n$. Let $\boldsymbol x$ and $\boldsymbol y$ be vectors in $\mathbb R^n$ and let $\boldsymbol z = \boldsymbol x + \boldsymbol y$. If $\boldsymbol u$ is the projection of $\boldsymbol x$ onto $W$ and $\boldsymbol v$ is the projection of $\boldsymbol y$ onto $W$, show that $\boldsymbol u + \boldsymbol v$ is the projection of $\boldsymbol z$ onto $W$.
SOLUTION
Let $U$ be a matrix whose columns form an orthonormal basis for $W$. Then

$$\begin{aligned}proj_W\boldsymbol z &= UU^T\boldsymbol z \\&= UU^T (\boldsymbol x + \boldsymbol y)\\&= UU^T \boldsymbol x + UU^T \boldsymbol y \\&= proj_W \boldsymbol x + proj_W \boldsymbol y \\&=\boldsymbol u +\boldsymbol v\end{aligned}$$
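The same computation, checked numerically (a sketch with a made-up $W$; $U$ again comes from a reduced QR factorization):

```python
import numpy as np

rng = np.random.default_rng(4)
U, _ = np.linalg.qr(rng.standard_normal((4, 2)))   # orthonormal basis of W

x = rng.standard_normal(4)
y = rng.standard_normal(4)
z = x + y

u = U @ (U.T @ x)                                  # proj_W x
v = U @ (U.T @ y)                                  # proj_W y

print(np.allclose(U @ (U.T @ z), u + v))           # proj_W z = u + v: True
```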