The orthogonal projection of a point in $\mathbb R^2$ onto a line through the origin has an important analogue in $\mathbb R^n$.
Given a vector $\boldsymbol y$ and a subspace $W$ in $\mathbb R^n$, there is a vector $\hat{\boldsymbol y}$ in $W$ such that (1) $\hat{\boldsymbol y}$ is the unique vector in $W$ for which $\boldsymbol y - \hat{\boldsymbol y}$ is orthogonal to $W$, and (2) $\hat{\boldsymbol y}$ is the unique vector in $W$ closest to $\boldsymbol y$. See Figure 1.
EXAMPLE 1
Let $\{\boldsymbol u_1,\dots,\boldsymbol u_5\}$ be an orthogonal basis for $\mathbb R^5$ and let

$$\boldsymbol y = c_1\boldsymbol u_1 + c_2\boldsymbol u_2 + \cdots + c_5\boldsymbol u_5$$

Consider the subspace $W = \text{Span}\{\boldsymbol u_1,\boldsymbol u_2\}$, and write $\boldsymbol y$ as the sum of a vector $\boldsymbol z_1$ in $W$ and a vector $\boldsymbol z_2$ in $W^\perp$.
SOLUTION

Write

$$\boldsymbol y = (c_1\boldsymbol u_1 + c_2\boldsymbol u_2) + (c_3\boldsymbol u_3 + c_4\boldsymbol u_4 + c_5\boldsymbol u_5)$$

and set $\boldsymbol z_1 = c_1\boldsymbol u_1 + c_2\boldsymbol u_2$ and $\boldsymbol z_2 = c_3\boldsymbol u_3 + c_4\boldsymbol u_4 + c_5\boldsymbol u_5$. Then $\boldsymbol z_1$ is in $W = \text{Span}\{\boldsymbol u_1,\boldsymbol u_2\}$. To show that $\boldsymbol z_2$ is in $W^\perp$, it suffices to show that $\boldsymbol z_2$ is orthogonal to $\boldsymbol u_1$ and $\boldsymbol u_2$, the vectors that span $W$. By the orthogonality of the basis,

$$\boldsymbol z_2\cdot\boldsymbol u_1 = c_3\,\boldsymbol u_3\cdot\boldsymbol u_1 + c_4\,\boldsymbol u_4\cdot\boldsymbol u_1 + c_5\,\boldsymbol u_5\cdot\boldsymbol u_1 = 0$$

and similarly $\boldsymbol z_2\cdot\boldsymbol u_2 = 0$. Hence $\boldsymbol z_2$ is orthogonal to every vector in $W$; that is, $\boldsymbol z_2$ is in $W^\perp$.
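For a concrete check of this decomposition, here is a short NumPy sketch (my own illustration, not from the text; the basis comes from a QR factorization and the coefficients are made up):

```python
import numpy as np

# Columns of Q form an orthonormal (hence orthogonal) basis for R^5.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
u = [Q[:, i] for i in range(5)]

c = [3.0, -1.0, 2.0, 0.5, 4.0]                 # made-up coefficients c1..c5
y = sum(ci * ui for ci, ui in zip(c, u))       # y = c1 u1 + ... + c5 u5

z1 = c[0] * u[0] + c[1] * u[1]                 # the piece in W = Span{u1, u2}
z2 = y - z1                                    # the piece claimed to lie in W-perp

print(np.allclose([z2 @ u[0], z2 @ u[1]], 0))  # z2 is orthogonal to u1, u2: True
```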
The next theorem shows that the decomposition $\boldsymbol y = \boldsymbol z_1 + \boldsymbol z_2$ in Example 1 can be computed without having an orthogonal basis for $\mathbb R^n$. It is enough to have an orthogonal basis only for $W$.

THEOREM 8 (THE ORTHOGONAL DECOMPOSITION THEOREM)

Let $W$ be a subspace of $\mathbb R^n$. Then each $\boldsymbol y$ in $\mathbb R^n$ can be written uniquely in the form

$$\boldsymbol y = \hat{\boldsymbol y} + \boldsymbol z \tag{1}$$

where $\hat{\boldsymbol y}$ is in $W$ and $\boldsymbol z$ is in $W^\perp$. In fact, if $\{\boldsymbol u_1,\dots,\boldsymbol u_p\}$ is any orthogonal basis of $W$, then

$$\hat{\boldsymbol y} = \frac{\boldsymbol y\cdot\boldsymbol u_1}{\boldsymbol u_1\cdot\boldsymbol u_1}\boldsymbol u_1 + \cdots + \frac{\boldsymbol y\cdot\boldsymbol u_p}{\boldsymbol u_p\cdot\boldsymbol u_p}\boldsymbol u_p \tag{2}$$

and $\boldsymbol z = \boldsymbol y - \hat{\boldsymbol y}$.
The vector $\hat{\boldsymbol y}$ in (1) is called the orthogonal projection of $\boldsymbol y$ onto $W$ and often is written as $proj_W\boldsymbol y$. See Figure 2. When $W$ is a one-dimensional subspace, the formula for $\hat{\boldsymbol y}$ matches the formula given in Section 6.2.
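Formula (2) translates directly into code. The sketch below is a minimal illustration (the function name `proj_onto_subspace` and the example vectors are my own; it assumes the basis vectors are mutually orthogonal and nonzero):

```python
import numpy as np

def proj_onto_subspace(y, basis):
    """Orthogonal projection of y onto W = Span(basis), per formula (2).

    `basis` is a list of mutually orthogonal, nonzero vectors in R^n.
    Each term (y . u_j / u_j . u_j) u_j is a 1-D projection onto u_j.
    """
    return sum(((y @ u) / (u @ u)) * u for u in basis)

# Project onto the plane spanned by two orthogonal vectors in R^3.
u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])           # u1 . u2 = 0
y  = np.array([1.0, 2.0, 3.0])

y_hat = proj_onto_subspace(y, [u1, u2])
z = y - y_hat
print(np.allclose([z @ u1, z @ u2], 0))   # z is in W-perp: True
```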
PROOF
We may assume that $W$ is not the zero subspace, for otherwise $W^\perp = \mathbb R^n$ and (1) is simply $\boldsymbol y = \boldsymbol 0 + \boldsymbol y$. The next section will show that any nonzero subspace of $\mathbb R^n$ has an orthogonal basis.
Let $\{\boldsymbol u_1,\dots,\boldsymbol u_p\}$ be any orthogonal basis for $W$, and define $\hat{\boldsymbol y}$ by (2). Then $\hat{\boldsymbol y}$ is in $W$ because it is a linear combination of $\boldsymbol u_1,\dots,\boldsymbol u_p$. Let $\boldsymbol z = \boldsymbol y - \hat{\boldsymbol y}$. Since $\boldsymbol u_1$ is orthogonal to $\boldsymbol u_2,\dots,\boldsymbol u_p$, it follows from (2) that

$$\boldsymbol z\cdot\boldsymbol u_1 = (\boldsymbol y - \hat{\boldsymbol y})\cdot\boldsymbol u_1 = \boldsymbol y\cdot\boldsymbol u_1 - \frac{\boldsymbol y\cdot\boldsymbol u_1}{\boldsymbol u_1\cdot\boldsymbol u_1}\boldsymbol u_1\cdot\boldsymbol u_1 - 0 - \cdots - 0 = \boldsymbol y\cdot\boldsymbol u_1 - \boldsymbol y\cdot\boldsymbol u_1 = 0$$
Thus $\boldsymbol z$ is orthogonal to $\boldsymbol u_1$. Similarly, $\boldsymbol z$ is orthogonal to each $\boldsymbol u_j$ in the basis for $W$. Hence $\boldsymbol z$ is orthogonal to every vector in $W$. That is, $\boldsymbol z$ is in $W^\perp$.
To show that the decomposition in (1) is unique, suppose $\boldsymbol y$ can also be written as $\boldsymbol y = \hat{\boldsymbol y}_1 + \boldsymbol z_1$, with $\hat{\boldsymbol y}_1$ in $W$ and $\boldsymbol z_1$ in $W^\perp$. Then $\hat{\boldsymbol y} + \boldsymbol z = \hat{\boldsymbol y}_1 + \boldsymbol z_1$ (since both sides equal $\boldsymbol y$), and so

$$\hat{\boldsymbol y} - \hat{\boldsymbol y}_1 = \boldsymbol z_1 - \boldsymbol z$$
This equality shows that the vector $\boldsymbol v = \hat{\boldsymbol y} - \hat{\boldsymbol y}_1$ is in $W$ and in $W^\perp$. Hence $\boldsymbol v\cdot\boldsymbol v = 0$, which shows that $\boldsymbol v = \boldsymbol 0$. This proves that $\hat{\boldsymbol y} = \hat{\boldsymbol y}_1$ and also $\boldsymbol z_1 = \boldsymbol z$.
EXERCISES
Suppose that $\{\boldsymbol u_1, \boldsymbol u_2\}$ is an orthogonal set of nonzero vectors in $\mathbb R^3$. How would you find an orthogonal basis of $\mathbb R^3$ that contains $\boldsymbol u_1$ and $\boldsymbol u_2$?
SOLUTION
First, find a vector $\boldsymbol v$ in $\mathbb R^3$ that is not in the subspace $W$ spanned by $\boldsymbol u_1$ and $\boldsymbol u_2$. Let $\boldsymbol u_3 = \boldsymbol v - proj_W\boldsymbol v$; then $\boldsymbol u_3$ is orthogonal to $W$ by the Orthogonal Decomposition Theorem, and $\boldsymbol u_3 \neq \boldsymbol 0$ because $\boldsymbol v$ is not in $W$. Hence $\{\boldsymbol u_1, \boldsymbol u_2, \boldsymbol u_3\}$ is an orthogonal basis.
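A numerical version of this construction (the vectors are illustrative choices of my own; `proj_W` implements formula (2)):

```python
import numpy as np

u1 = np.array([1.0, 0.0, 1.0])
u2 = np.array([1.0, 0.0, -1.0])       # orthogonal to u1

def proj_W(v, basis):
    # formula (2): sum of 1-D projections onto the orthogonal basis vectors
    return sum(((v @ u) / (u @ u)) * u for u in basis)

# Any v outside W = Span{u1, u2} works; here (0, 1, 0) is clearly not in W.
v = np.array([0.0, 1.0, 0.0])
u3 = v - proj_W(v, [u1, u2])          # the component of v orthogonal to W

print(np.allclose([u3 @ u1, u3 @ u2], 0), np.any(u3 != 0))   # True True
```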
EXERCISE 23
Let $A$ be an $m \times n$ matrix. Prove that every vector $\boldsymbol x$ in $\mathbb R^n$ can be written in the form $\boldsymbol x = \boldsymbol p + \boldsymbol u$, where $\boldsymbol p$ is in $\text{Row}\,A$ and $\boldsymbol u$ is in $\text{Nul}\,A$. Also, show that if the equation $A\boldsymbol x = \boldsymbol b$ is consistent, then there is a unique $\boldsymbol p$ in $\text{Row}\,A$ such that $A\boldsymbol p = \boldsymbol b$.
SOLUTION
By the Orthogonal Decomposition Theorem, each $\boldsymbol x$ in $\mathbb R^n$ can be written uniquely as $\boldsymbol x = \boldsymbol p + \boldsymbol u$, with $\boldsymbol p$ in $\text{Row}\,A$ and $\boldsymbol u$ in $(\text{Row}\,A)^\perp$. By Theorem 3 in Section 6.1, $\boldsymbol u$ is in $\text{Nul}\,A$.
Next, suppose that $A\boldsymbol x = \boldsymbol b$ is consistent. Let $\boldsymbol x$ be a solution, and write $\boldsymbol x = \boldsymbol p + \boldsymbol u$ as above. Then

$$A\boldsymbol p = A(\boldsymbol x - \boldsymbol u) = A\boldsymbol x - A\boldsymbol u = \boldsymbol b - \boldsymbol 0 = \boldsymbol b$$

so the equation $A\boldsymbol x = \boldsymbol b$ has at least one solution $\boldsymbol p$ in $\text{Row}\,A$.
Finally, suppose that $\boldsymbol p$ and $\boldsymbol p_1$ are both in $\text{Row}\,A$ and both satisfy $A\boldsymbol x = \boldsymbol b$. Then $\boldsymbol p - \boldsymbol p_1$ is in $\text{Nul}\,A$ because

$$A(\boldsymbol p - \boldsymbol p_1) = A\boldsymbol p - A\boldsymbol p_1 = \boldsymbol b - \boldsymbol b = \boldsymbol 0$$
The equations $\boldsymbol p = \boldsymbol p_1 + (\boldsymbol p - \boldsymbol p_1)$ and $\boldsymbol p = \boldsymbol p + \boldsymbol 0$ both decompose $\boldsymbol p$ as the sum of a vector in $\text{Row}\,A$ and a vector in $(\text{Row}\,A)^\perp$. By the uniqueness of the orthogonal decomposition (Theorem 8), $\boldsymbol p_1 = \boldsymbol p$, so $\boldsymbol p$ is unique.
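This decomposition can be computed numerically using an orthonormal basis of $\text{Row}\,A$ taken from the SVD. The sketch below is my own illustration with made-up data, not part of the exercise:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))       # an m x n matrix, here 3 x 5
x = rng.standard_normal(5)

# Orthonormal basis for Row A: the first r rows of Vt from the SVD.
_, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))            # rank of A
V = Vt[:r].T                          # columns span Row A

p = V @ (V.T @ x)                     # projection of x onto Row A
u = x - p                             # the remaining component

print(np.allclose(A @ u, 0))          # u is in Nul A = (Row A)-perp: True
print(np.allclose(A @ p, A @ x))      # p solves A p = A x = b: True
```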
A Geometric Interpretation of the Orthogonal Projection
When $W$ is a one-dimensional subspace, the formula (2) for $proj_W \boldsymbol y$ contains just one term. Thus, when $\dim W > 1$, each term in (2) is itself an orthogonal projection of $\boldsymbol y$ onto a one-dimensional subspace spanned by one of the $\boldsymbol u$'s in the basis for $W$. Figure 3 illustrates this when $W$ is a subspace of $\mathbb R^3$ spanned by $\boldsymbol u_1$ and $\boldsymbol u_2$.
Properties of Orthogonal Projections
If $\boldsymbol y$ is in $W = \text{Span}\{\boldsymbol u_1,\dots,\boldsymbol u_p\}$, then $proj_W \boldsymbol y = \boldsymbol y$. This fact also follows from the next theorem.
THEOREM 9 (THE BEST APPROXIMATION THEOREM)

Let $W$ be a subspace of $\mathbb R^n$, let $\boldsymbol y$ be any vector in $\mathbb R^n$, and let $\hat{\boldsymbol y}$ be the orthogonal projection of $\boldsymbol y$ onto $W$. Then $\hat{\boldsymbol y}$ is the closest point in $W$ to $\boldsymbol y$, in the sense that

$$\left\|\boldsymbol y - \hat{\boldsymbol y}\right\| < \left\|\boldsymbol y - \boldsymbol v\right\| \tag{3}$$

for all $\boldsymbol v$ in $W$ distinct from $\hat{\boldsymbol y}$.
The vector $\hat{\boldsymbol y}$ in Theorem 9 is called the best approximation to $\boldsymbol y$ by elements of $W$.
Later sections in the text will examine problems where a given $\boldsymbol y$ must be replaced, or approximated, by a vector $\boldsymbol v$ in some fixed subspace $W$. The distance $\left\|\boldsymbol y-\boldsymbol v\right\|$ can be regarded as the "error" of using $\boldsymbol v$ in place of $\boldsymbol y$. Theorem 9 says that this error is minimized when $\boldsymbol v = \hat{\boldsymbol y}$.
Inequality (3) leads to a new proof that $\hat{\boldsymbol y}$ does not depend on the particular orthogonal basis used to compute it.
PROOF
Take $\boldsymbol v$ in $W$ distinct from $\hat{\boldsymbol y}$. See Figure 4. Then $\hat{\boldsymbol y} - \boldsymbol v$ is in $W$, so $\boldsymbol y - \hat{\boldsymbol y}$ is orthogonal to $\hat{\boldsymbol y} - \boldsymbol v$. Since

$$\boldsymbol y - \boldsymbol v = (\boldsymbol y - \hat{\boldsymbol y}) + (\hat{\boldsymbol y} - \boldsymbol v)$$

the Pythagorean Theorem gives

$$\left\|\boldsymbol y - \boldsymbol v\right\|^2 = \left\|\boldsymbol y - \hat{\boldsymbol y}\right\|^2 + \left\|\hat{\boldsymbol y} - \boldsymbol v\right\|^2$$
Now $\left\|\hat{\boldsymbol y} - \boldsymbol v\right\| > 0$, and so inequality (3) follows immediately.
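Inequality (3) is easy to observe numerically: any other point of $W$ is farther from $\boldsymbol y$ than $\hat{\boldsymbol y}$ is. A small sketch with illustrative vectors of my own, sampling random points $\boldsymbol v = a\boldsymbol u_1 + b\boldsymbol u_2$ in $W$:

```python
import numpy as np

u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])               # orthogonal to u1
y  = np.array([1.0, 2.0, 3.0])

# y_hat = proj_W y, computed by formula (2)
y_hat = ((y @ u1) / (u1 @ u1)) * u1 + ((y @ u2) / (u2 @ u2)) * u2

# Random points v in W are never closer to y than y_hat is.
rng = np.random.default_rng(2)
for _ in range(5):
    a, b = rng.standard_normal(2)
    v = a * u1 + b * u2
    assert np.linalg.norm(y - y_hat) <= np.linalg.norm(y - v)
print("y_hat is closest among the sampled points of W")
```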
The final theorem in this section shows how formula (2) for $proj_W \boldsymbol y$ is simplified when the basis for $W$ is an orthonormal set.

THEOREM 10

If $\{\boldsymbol u_1,\dots,\boldsymbol u_p\}$ is an orthonormal basis for a subspace $W$ of $\mathbb R^n$, then

$$proj_W \boldsymbol y = (\boldsymbol y\cdot\boldsymbol u_1)\boldsymbol u_1 + (\boldsymbol y\cdot\boldsymbol u_2)\boldsymbol u_2 + \cdots + (\boldsymbol y\cdot\boldsymbol u_p)\boldsymbol u_p \tag{4}$$

If $U = \begin{bmatrix}\boldsymbol u_1 & \boldsymbol u_2 & \cdots & \boldsymbol u_p\end{bmatrix}$, then

$$proj_W \boldsymbol y = UU^T\boldsymbol y \quad \text{for all } \boldsymbol y \text{ in } \mathbb R^n \tag{5}$$
Suppose $U$ is an $n \times p$ matrix with orthonormal columns, and let $W$ be the column space of $U$. Then

$$U^TU\boldsymbol x = \boldsymbol x \quad \text{for all } \boldsymbol x \text{ in } \mathbb R^p$$

$$UU^T\boldsymbol y = proj_W\boldsymbol y \quad \text{for all } \boldsymbol y \text{ in } \mathbb R^n$$
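A minimal sketch of these two identities, assuming $U$ is produced by a reduced QR factorization so that its columns are orthonormal (the data are made up):

```python
import numpy as np

# U: an n x p matrix with orthonormal columns (reduced QR of a 5 x 2 matrix)
rng = np.random.default_rng(3)
U, _ = np.linalg.qr(rng.standard_normal((5, 2)))

x = rng.standard_normal(2)
y = rng.standard_normal(5)

print(np.allclose(U.T @ U @ x, x))        # U^T U x = x for all x in R^p
y_hat = U @ (U.T @ y)                     # proj_W y = U U^T y
print(np.allclose(U.T @ (y - y_hat), 0))  # y - y_hat is in W-perp: True
```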
EXAMPLE
Let $W$ be a subspace of $\mathbb R^n$. Let $\boldsymbol x$ and $\boldsymbol y$ be vectors in $\mathbb R^n$ and let $\boldsymbol z = \boldsymbol x + \boldsymbol y$. If $\boldsymbol u$ is the projection of $\boldsymbol x$ onto $W$ and $\boldsymbol v$ is the projection of $\boldsymbol y$ onto $W$, show that $\boldsymbol u + \boldsymbol v$ is the projection of $\boldsymbol z$ onto $W$.
SOLUTION
Let $U$ be a matrix whose columns form an orthonormal basis for $W$. Then

$$\begin{aligned}proj_W\boldsymbol z &= UU^T\boldsymbol z \\&= UU^T (\boldsymbol x + \boldsymbol y)\\&= UU^T \boldsymbol x + UU^T \boldsymbol y \\&= proj_W \boldsymbol x + proj_W \boldsymbol y \\&=\boldsymbol u +\boldsymbol v\end{aligned}$$
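The same computation, checked numerically (a sketch with a made-up $W$; $U$ again comes from a reduced QR factorization):

```python
import numpy as np

rng = np.random.default_rng(4)
U, _ = np.linalg.qr(rng.standard_normal((4, 2)))   # orthonormal basis of W

x = rng.standard_normal(4)
y = rng.standard_normal(4)
z = x + y

u = U @ (U.T @ x)                                  # proj_W x
v = U @ (U.T @ y)                                  # proj_W y

print(np.allclose(U @ (U.T @ z), u + v))           # proj_W z = u + v: True
```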