These are reading notes on *Linear Algebra and Its Applications*.
A factorization of a matrix A is an equation that expresses A as a product of two or more matrices. Whereas matrix multiplication involves a synthesis of data (combining the effects of two or more linear transformations into a single matrix), matrix factorization is an analysis of data. In the language of computer science, the expression of A as a product amounts to a preprocessing of the data in A, organizing that data into two or more parts whose structures are more useful in some way, perhaps more accessible for computation.
The LU Factorization
The LU factorization is motivated by the fairly common industrial and business problem of solving a sequence of equations, all with the same coefficient matrix:

Ax = b₁, Ax = b₂, …, Ax = bₚ    (1)
When A is invertible, one could compute A⁻¹ and then compute A⁻¹b₁, A⁻¹b₂, and so on. However, it is more efficient to solve the first equation in sequence (1) by row reduction and obtain an LU factorization of A at the same time. Thereafter, the remaining equations in sequence (1) are solved with the LU factorization.
At first, assume that A is an m×n matrix that can be row reduced to echelon form without row interchanges. (Later, we will treat the general case.) Then A can be written in the form A = LU, where L is an m×m lower triangular matrix with 1's on the diagonal and U is an m×n echelon form of A. For instance, see Figure 1. Such a factorization is called an LU factorization of A. The matrix L is invertible and is called a unit lower triangular matrix.
Before studying how to construct L and U, we should look at why they are so useful. When A = LU, the equation Ax = b can be written as L(Ux) = b. Writing y for Ux, we can find x by solving the pair of equations

Ly = b
Ux = y    (2)
First solve Ly = b for y, and then solve Ux = y for x. See Figure 2. Each equation is easy to solve because L and U are triangular.
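The two triangular solves can be sketched in Python. This is a minimal illustration, not the book's method of presentation; the 3×3 matrices below are hypothetical, chosen only so the arithmetic works out exactly.

```python
# Solve A x = b given a factorization A = L U, where L is unit lower
# triangular and U is upper triangular with nonzero diagonal.
# The matrices here are illustrative, not from the text.

def forward_sub(L, b):
    """Solve L y = b for y (L unit lower triangular: no divisions needed)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y

def back_sub(U, y):
    """Solve U x = y for x (U upper triangular, nonzero diagonal)."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L = [[1.0, 0.0, 0.0],
     [2.0, 1.0, 0.0],
     [-1.0, 3.0, 1.0]]
U = [[2.0, 1.0, 1.0],
     [0.0, 1.0, -1.0],
     [0.0, 0.0, 4.0]]
b = [4.0, 11.0, 3.0]

y = forward_sub(L, b)   # first solve L y = b
x = back_sub(U, y)      # then solve U x = y
```

Neither solve ever touches the other factor, which is why the pair of triangular systems is so much cheaper than a fresh row reduction of A.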
EXAMPLE 1
It can be verified that the given matrix A factors as A = LU.
Use this LU factorization of A to solve Ax = b.
SOLUTION The solution of Ly = b needs only 6 multiplications and 6 additions.
Then, for Ux = y, the “backward” phase of row reduction requires 4 divisions, 6 multiplications, and 6 additions. (For instance, creating the zeros in column 4 requires 1 division in row 4 and 3 multiplication–addition pairs to add multiples of row 4 to the rows above.)
To find x requires 28 arithmetic operations, or “flops” (floating point operations), excluding the cost of finding L and U. In contrast, row reduction of [A b] to [I x] takes 62 operations.
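These operation counts can be checked with a short sketch. The formulas below assume one multiply–add pair per eliminated entry, plus one division per diagonal entry when the diagonal is not all 1's; for n = 4 they reproduce the counts above.

```python
# Count flops for solving one triangular system of size n.
# A unit lower triangular solve needs no divisions; a general triangular
# solve divides by each of the n diagonal entries.

def triangular_solve_flops(n, unit_diagonal):
    mults = n * (n - 1) // 2          # one multiply per previously solved entry used
    adds = n * (n - 1) // 2           # each multiply is paired with an add/subtract
    divs = 0 if unit_diagonal else n  # divide by the diagonal only when it is not 1
    return divs, mults, adds

forward = triangular_solve_flops(4, unit_diagonal=True)   # L y = b
backward = triangular_solve_flops(4, unit_diagonal=False) # U x = y
total = sum(forward) + sum(backward)                      # 28, matching Example 1
```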
Checkpoint: Section 2.2 shows how to compute A⁻¹ by row reduction of [A I]. Describe how you could speed up this calculation if you have an LU factorization of A available (and A is invertible).
Answer: If A is an invertible n×n matrix with an LU factorization A = LU, then A⁻¹ can be computed by first row reducing [L I] to [I Y] for some Y (namely Y = L⁻¹) and then reducing [U Y] to [I A⁻¹]. MATLAB uses this approach to compute A⁻¹ (after first finding L and U).
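Equivalently, A⁻¹ can be built one column at a time: column j of A⁻¹ solves Ax = eⱼ, so each column costs one forward and one backward triangular solve. A sketch with an illustrative (hypothetical) 2×2 factorization:

```python
# Compute A^{-1} column by column from A = L U: for each standard basis
# vector e_j, solve L y = e_j and then U x = y; x is column j of A^{-1}.
# The factors below are illustrative, so A = L U = [[2, 4], [6, 10]].

def forward_sub(L, b):
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y

def back_sub(U, y):
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L = [[1.0, 0.0], [3.0, 1.0]]
U = [[2.0, 4.0], [0.0, -2.0]]

n = 2
inv_cols = []
for j in range(n):
    e = [1.0 if i == j else 0.0 for i in range(n)]   # e_j
    inv_cols.append(back_sub(U, forward_sub(L, e)))

# Transpose the list of solution columns into the rows of A^{-1}.
A_inv = [[inv_cols[j][i] for j in range(n)] for i in range(n)]
```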
EXAMPLE
When A is invertible, MATLAB finds A⁻¹ by factoring A = LU (where L may be permuted lower triangular), inverting L and U, and then computing U⁻¹L⁻¹.
An LU Factorization Algorithm
Suppose A can be reduced to an echelon form U using only row replacements that add a multiple of one row to another row below it. In this case, there exist unit lower triangular elementary matrices E₁, …, Eₚ such that

Eₚ ⋯ E₁ A = U    (3)

Then

A = (Eₚ ⋯ E₁)⁻¹ U = LU,  where  L = (Eₚ ⋯ E₁)⁻¹    (4)
It can be shown that products and inverses of unit lower triangular matrices are also unit lower triangular. Thus L is unit lower triangular.
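A quick numerical illustration of this closure claim (the elementary matrices below are illustrative, not from a particular example):

```python
# Check that a product of unit lower triangular matrices, and the inverse
# of one, are again unit lower triangular.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_unit_lower_triangular(M):
    n = len(M)
    return (all(M[i][i] == 1 for i in range(n)) and
            all(M[i][j] == 0 for i in range(n) for j in range(i + 1, n)))

E1 = [[1, 0, 0], [2, 1, 0], [0, 0, 1]]    # row 2 += 2 * (row 1)
E2 = [[1, 0, 0], [0, 1, 0], [-3, 4, 1]]   # row 3 += -3*(row 1) + 4*(row 2)
E1_inv = [[1, 0, 0], [-2, 1, 0], [0, 0, 1]]  # undo "row 2 += 2 * (row 1)"

product = matmul(E2, E1)
```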
PROOF
Let A be a lower triangular matrix with nonzero entries on the diagonal. Show that A is invertible and A⁻¹ is lower triangular.
[Hint: Explain why A can be changed into I using only row replacements and scaling. Also, explain why the row operations that reduce A to I change I into a lower triangular matrix.] One can also prove this using partitioned matrices and induction, as introduced in the previous section.
Note that the row operations in equation (3), which reduce A to U, also reduce the L in equation (4) to I, since Eₚ ⋯ E₁ L = (Eₚ ⋯ E₁)(Eₚ ⋯ E₁)⁻¹ = I. This observation is the key to constructing L.
Algorithm for an LU factorization:

1. Reduce A to an echelon form U by a sequence of row replacement operations, if possible.
2. Place entries in L such that the same sequence of row operations reduces L to I.

Step 1 is not always possible, but when it is, the argument above shows that an LU factorization exists.
EXAMPLE 2
Find an LU factorization of the given matrix A.
SOLUTION
Since A has four rows, L should be 4×4. The first column of L is the first column of A divided by the top pivot entry:
Compare the first columns of A and L. The row operations that create zeros in the first column of A will also create zeros in the first column of L. To make this same correspondence of row operations hold for the rest of L, watch a row reduction of A to an echelon form U. That is, highlight the entries in each matrix that are used to determine the sequence of row operations that transform A into U.
The highlighted entries determine the row reduction of A to U. At each pivot column, divide the highlighted entries by the pivot and place the result into L:
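The construction just described — at each pivot, store the pivot column's lower entries divided by the pivot in L, then eliminate below the pivot — can be sketched as follows for a square matrix needing no row interchanges. The 3×3 matrix is illustrative, not the one in Example 2.

```python
# Sketch of an LU factorization by row replacements only (no interchanges).
# At step k, the multipliers U[i][k] / U[k][k] become column k of L.

def lu_factor(A):
    n = len(A)
    U = [row[:] for row in A]   # working copy, reduced to echelon form
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]      # pivot column entry, divided by pivot
            for j in range(k + 1, n):
                U[i][j] -= L[i][k] * U[k][j] # row replacement
            U[i][k] = 0.0                    # entry below the pivot is now zero
    return L, U

A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 1.0],
     [-2.0, 2.0, 5.0]]
L, U = lu_factor(A)
```

Note that the same sequence of row replacements that turns A into U would turn this L back into I, exactly as the algorithm requires.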
In practical work, row interchanges are nearly always needed, because partial pivoting is used for high accuracy. (Recall that this procedure selects, among the possible choices for a pivot, an entry in the column having the largest absolute value.)
To handle row interchanges, the LU factorization above can be modified easily to produce an L that is permuted lower triangular, in the sense that a rearrangement (called a permutation) of the rows of L can make L (unit) lower triangular.
The resulting permuted LU factorization solves Ax = b in the same way as before, except that the reduction of [L b] to [I y] follows the order of the pivots in L from left to right, starting with the pivot in the first column. A reference to an “LU factorization” usually includes the possibility that L might be permuted lower triangular. For details, see the Appendix.
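A sketch of the partial-pivoting variant. It returns a row ordering perm together with a unit lower triangular L and upper triangular U such that the rows of A reordered by perm equal LU; undoing the permutation on L gives the permuted lower triangular factor described above. The 3×3 matrix is illustrative.

```python
# LU factorization with partial pivoting: at each step, pick the pivot of
# largest absolute value in the current column, then eliminate below it.

def plu_factor(A):
    n = len(A)
    U = [row[:] for row in A]
    L = [[0.0] * n for _ in range(n)]
    perm = list(range(n))                    # records the row rearrangement
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(U[i][k]))  # partial pivoting
        U[k], U[p] = U[p], U[k]
        L[k], L[p] = L[p], L[k]              # keep earlier multipliers with their rows
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k + 1, n):
                U[i][j] -= L[i][k] * U[k][j]
            U[i][k] = 0.0
    for i in range(n):
        L[i][i] = 1.0                        # unit diagonal
    return perm, L, U

A = [[2.0, 1.0, 0.0],
     [4.0, 1.0, 1.0],
     [8.0, 2.0, 3.0]]
perm, L, U = plu_factor(A)   # rows of A in order perm equal L @ U
```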
A Matrix Factorization in Electrical Engineering
Matrix factorization is intimately related to the problem of constructing an electrical network with specified properties.
Suppose the box in Figure 3 represents some sort of electric circuit, with an input and output.
Record the input voltage and current by v₁ and i₁, and record the output voltage and current by v₂ and i₂. Frequently, the transformation (v₁, i₁) ↦ (v₂, i₂) is linear. That is, there is a 2×2 matrix A, called the transfer matrix, such that

[v₂; i₂] = A [v₁; i₁]
Figure 4 shows a ladder network. The left circuit in Figure 4 is called a series circuit, with resistance R₁.
The right circuit in Figure 4 is a shunt circuit, with resistance R₂. Using Ohm’s law and Kirchhoff’s laws, one can show that the transfer matrices of the series and shunt circuits, respectively, are

[1  -R₁]        [  1      0]
[0    1]  and   [-1/R₂    1]
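Assuming the standard transfer matrices that follow from Ohm’s and Kirchhoff’s laws — series: [[1, -R₁], [0, 1]]; shunt: [[1, 0], [-1/R₂, 1]] — composing a ladder network is just matrix multiplication. The resistance values below are illustrative.

```python
# Build and compose transfer matrices for a two-stage ladder network.
# A series circuit drops voltage (v2 = v1 - R1*i1); a shunt circuit
# diverts current (i2 = i1 - v1/R2).

def series_circuit(R1):
    return [[1.0, -R1], [0.0, 1.0]]

def shunt_circuit(R2):
    return [[1.0, 0.0], [-1.0 / R2, 1.0]]

def matmul2(A, B):
    return [[A[i][0] * B[0][j] + A[i][1] * B[1][j] for j in range(2)]
            for i in range(2)]

# The input signal passes through the series circuit first, so its matrix
# acts first: the ladder's transfer matrix is A2 * A1 (composition order).
A1 = series_circuit(8.0)
A2 = shunt_circuit(2.0)
ladder = matmul2(A2, A1)
```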
EXAMPLE 3
a. Compute the transfer matrix of the ladder network in Figure 4.
b. Design a ladder network whose transfer matrix is the given matrix.
SOLUTION
a. Let A₁ and A₂ be the transfer matrices of the series and shunt circuits, respectively. An input vector x passes through the series circuit first and then the shunt circuit, so the transfer matrix of the ladder network corresponds to the composition of the two linear transformations, x ↦ A₂(A₁x) = (A₂A₁)x. Thus the transfer matrix of the ladder network is A₂A₁.
b. To factor the given matrix into the product of transfer matrices, as in equation (6), look for R₁ and R₂ in Figure 4 that satisfy the required product.
A network transfer matrix summarizes the input–output behavior (the design specifications) of the network without reference to the interior circuits. To physically build a network with specified properties, an engineer first determines if such a network can be constructed (or realized). Then the engineer tries to factor the transfer matrix into matrices corresponding to smaller circuits that perhaps are already manufactured and ready for assembly. In the common case of alternating current, the entries in the transfer matrix are usually rational complex-valued functions. A standard problem is to find a minimal realization that uses the smallest number of electrical components.
The QR Factorization
Suppose A = QR, where Q and R are n×n. R is invertible and upper triangular, and Q has the property that QᵀQ = I. It can be shown that for each b in ℝⁿ, the equation Ax = b has a unique solution.
When A = QR, the equation Ax = b can be written as QRx = b. Multiplying both sides by Qᵀ and using QᵀQ = I gives Rx = Qᵀb, which is easy to solve because R is triangular.
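A minimal sketch of this solve step. The Q (with orthonormal columns, so QᵀQ = I) and upper triangular R below are illustrative, not from the text.

```python
# Solve A x = b from A = Q R: compute c = Q^T b, then back-substitute R x = c.

def transpose_apply(Q, b):
    """Compute Q^T b."""
    n = len(b)
    return [sum(Q[i][j] * b[i] for i in range(n)) for j in range(n)]

def back_sub(R, c):
    """Solve R x = c for x (R upper triangular, nonzero diagonal)."""
    n = len(c)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (c[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x

Q = [[0.6, -0.8],
     [0.8, 0.6]]      # a rotation, so Q^T Q = I
R = [[5.0, 1.0],
     [0.0, 2.0]]      # invertible and upper triangular
b = [1.0, 7.0]

c = transpose_apply(Q, b)   # c = Q^T b
x = back_sub(R, c)          # solve R x = c; x also solves (QR) x = b
```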
Appendix: Permuted LU Factorizations
Any matrix A admits a factorization A = LU, with U in echelon form and L a permuted unit lower triangular matrix. That is, L is a matrix such that a permutation (rearrangement) of its rows (using row interchanges) will produce a lower triangular matrix with 1’s on the diagonal.
The construction of L and U, illustrated below, depends on first using row replacements to reduce A to a permuted echelon form A* and then using row interchanges to reduce A* to an echelon form U. By watching the reduction of A to A*, we can easily construct a permuted unit lower triangular matrix L with the property that the sequence of operations changing A into A* also changes L into a permutation matrix. This property will guarantee that A = LU.
The following algorithm reduces any matrix to a permuted echelon form. In the algorithm, when a row is covered, we ignore it in later calculations.
1. Begin with the leftmost nonzero column. Choose any nonzero entry as the pivot. Designate the corresponding row as a pivot row.
2. Use row replacements to create zeros above and below the pivot (in all uncovered rows). Then cover that pivot row.
3. Repeat steps 1 and 2 on the uncovered submatrix, if any, until all nonzero entries are covered.
As an example, choose any entry in the first column of the following matrix as the first pivot, and use the pivot to create zeros in the rest of column 1. We choose the (3, 1)-entry.
Row 3 is the first pivot row. Choose the (2, 2)-entry as the second pivot, and create zeros in the rest of column 2, excluding the first pivot row.
Cover row 2 and choose the (4, 4)-entry as the pivot. Create zeros in the other rows, excluding the first two pivot rows.
Let A* denote this permuted echelon form, and permute the rows of A* to create an echelon form. The resulting echelon matrix is U.
The last step is to create L. Go back and watch the reduction of A to A*. As each pivot is selected, take the pivot column, and divide the pivot into each entry in the column that is not yet in a pivot row. Place the resulting column into L. At the end, fill the holes in L with zeros.
The next example illustrates what to do when the echelon form U has one or more rows of zeros. For the reduction of A to A*, pivots were chosen to have the largest possible magnitude (the choice used for “partial pivoting”).
The first three columns of L come from the three pivot columns above.
The matrix L needs two more columns. Use columns 1 and 3 of the identity matrix to place 1’s in the “nonpivot” rows 1 and 3. Fill in the remaining holes with zeros.