Chapter 6
Direct Methods for Solving Linear Systems
Per-Olof Persson
persson@berkeley.edu
Department of Mathematics
University of California, Berkeley
Math 128A Numerical Analysis
Consider solving a linear system of the form: \[ \begin{aligned} E_1: a_{11}x_1 + a_{12}x_2+\cdots+a_{1n}x_n &= b_1, \\ E_2: a_{21}x_1 + a_{22}x_2+\cdots+a_{2n}x_n &= b_2, \\ & \ \,\vdots \\ E_n: a_{n1}x_1 + a_{n2}x_2+\cdots+a_{nn}x_n &= b_n, \end{aligned} \] for \(x_1,\ldots,x_n\). Direct methods give an answer in a fixed number of steps, subject only to round-off errors.
We use three row operations to simplify the linear system:
1. Equation \(E_i\) can be multiplied by a nonzero constant \(\lambda\): \((\lambda E_i)\rightarrow(E_i)\).
2. A multiple of one equation can be added to another: \((E_i + \lambda E_j)\rightarrow(E_i)\).
3. Two equations can be interchanged: \((E_i)\leftrightarrow(E_j)\).
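As a concrete illustration, here is a minimal Python/NumPy sketch of Gaussian elimination with backward substitution (the name `gauss_solve` is ours, and the sketch assumes every pivot encountered is nonzero, so no row interchanges are needed):

```python
import numpy as np

def gauss_solve(A, b):
    """Solve Ax = b by Gaussian elimination plus backward substitution.
    Sketch only: assumes every pivot is nonzero (no row interchanges)."""
    A = np.array(A, dtype=float)   # work on copies so the inputs survive
    b = np.array(b, dtype=float)
    n = len(b)
    # Elimination: E_i <- E_i - m * E_k zeroes out column k below the pivot.
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    # Backward substitution on the resulting upper-triangular system.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = [[4.0, -1.0, 1.0], [2.0, 5.0, 2.0], [1.0, 2.0, 4.0]]
b = [8.0, 3.0, 11.0]
print(gauss_solve(A, b))   # agrees with np.linalg.solve(A, b)
```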
An \(n\times m\) matrix is a rectangular array of elements with \(n\) rows and \(m\) columns, in which both the value and the position of an element are important.
Operation counts for the elimination step (reducing the system to triangular form). Multiplications/divisions: \[ \begin{aligned} \sum_{i=1}^{n-1} (n-i)(n-i+2) = \cdots = \frac{2n^3 + 3n^2 -5n}{6} \end{aligned} \] Additions/subtractions: \[ \begin{aligned} \sum_{i=1}^{n-1} (n-i)(n-i+1) = \cdots = \frac{n^3 - n}{3} \end{aligned} \]
Operation counts for backward substitution. Multiplications/divisions: \[ \begin{aligned} 1+\sum_{i=1}^{n-1} ((n-i)+1) = \frac{n^2+n}{2} \end{aligned} \] Additions/subtractions: \[ \begin{aligned} \sum_{i=1}^{n-1} ((n-i-1)+1) = \frac{n^2-n}{2} \end{aligned} \]
Total operation counts for Gaussian elimination with backward substitution. Multiplications/divisions: \[ \begin{aligned} \frac{n^3}{3}+n^2-\frac{n}{3} \end{aligned} \] Additions/subtractions: \[ \begin{aligned} \frac{n^3}{3}+\frac{n^2}{2}-\frac{5n}{6} \end{aligned} \]
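Both totals grow like \(n^3/3\), so for large \(n\) the elimination step dominates the cost. The closed forms are easy to sanity-check numerically; the short sketch below (our own tallying code, not from the text) accumulates the per-step counts from the sums above and compares them with the totals:

```python
def op_counts(n):
    """Tally multiplications/divisions and additions/subtractions for
    Gaussian elimination (steps i = 1..n-1) plus backward substitution."""
    muls = adds = 0
    for i in range(1, n):                                # elimination step i
        muls += (n - i) * (n - i + 2)
        adds += (n - i) * (n - i + 1)
    muls += 1 + sum((n - i) + 1 for i in range(1, n))    # backward substitution
    adds += sum((n - i - 1) + 1 for i in range(1, n))
    return muls, adds

for n in (2, 10, 100):
    muls, adds = op_counts(n)
    assert muls == (n**3 + 3 * n**2 - n) // 3            # n^3/3 + n^2 - n/3
    assert adds == (2 * n**3 + 3 * n**2 - 5 * n) // 6    # n^3/3 + n^2/2 - 5n/6
```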
Two matrices \(A\) and \(B\) are equal if they have the same number of rows and columns \(n\times m\) and if \(a_{ij}=b_{ij}\) for each \(i=1,\ldots,n\) and \(j=1,\ldots,m\).
If \(A\) and \(B\) are \(n\times m\) matrices, the sum \(A+B\) is the \(n\times m\) matrix with entries \(a_{ij}+b_{ij}\).
If \(A\) is \(n\times m\) and \(\lambda\) a real number, the scalar multiplication \(\lambda A\) is the \(n \times m\) matrix with entries \(\lambda a_{ij}\).
Let \(A,B,C\) be \(n\times m\) matrices and \(\lambda,\mu\) real numbers. Then \(A+B=B+A\); \((A+B)+C=A+(B+C)\); \(A+0=0+A=A\) and \(A+(-A)=0\), where \(0\) denotes the zero matrix and \(-A=(-1)A\); \(\lambda(A+B)=\lambda A+\lambda B\); \((\lambda+\mu)A=\lambda A+\mu A\); \(\lambda(\mu A)=(\lambda\mu)A\); and \(1A=A\).
Let \(A\) be \(n\times m\) and \(B\) be \(m\times p\). The matrix product \(C=AB\) is the \(n\times p\) matrix with entries \[ \begin{aligned} c_{ij} = \sum_{k=1}^m a_{ik}b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j}+\cdots + a_{im}b_{mj} \end{aligned} \]
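The definition translates directly into a triple loop. The sketch below (illustrative only, far slower than optimized libraries) computes each \(c_{ij}\) from the sum above:

```python
import numpy as np

def matmul(A, B):
    """Matrix product straight from the definition c_ij = sum_k a_ik * b_kj."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must agree"
    C = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
assert np.allclose(matmul(A, B), A @ B)
```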
Let \(A\) be \(n\times m\), \(B\) be \(m\times k\), \(C\) be \(k\times p\), \(D\) be \(m\times k\), and \(\lambda\) a real number. Then \(A(BC)=(AB)C\); \(A(B+D)=AB+AD\); \(I_m B=B\) and \(BI_k=B\), where \(I_m\) denotes the \(m\times m\) identity matrix; and \(\lambda(AB)=(\lambda A)B=A(\lambda B)\).
For any nonsingular \(n\times n\) matrix \(A\): the inverse \(A^{-1}\) is unique; \(A^{-1}\) is nonsingular with \((A^{-1})^{-1}=A\); and if \(B\) is also a nonsingular \(n\times n\) matrix, then \((AB)^{-1}=B^{-1}A^{-1}\).
The following statements are equivalent for any \(n\times n\) matrix \(A\): \(A\) is nonsingular (i.e., \(A^{-1}\) exists); \(\det A\ne 0\); the homogeneous system \(A\mathbf{x}=\mathbf{0}\) has only the trivial solution \(\mathbf{x}=\mathbf{0}\); the system \(A\mathbf{x}=\mathbf{b}\) has a unique solution for every \(\mathbf{b}\); and Gaussian elimination with row interchanges can be performed on \(A\mathbf{x}=\mathbf{b}\) for any \(\mathbf{b}\).
The \(k\)th Gaussian transformation matrix is defined by \[ \begin{aligned} M^{(k)} = \begin{bmatrix} 1 & 0 & & \cdots & & \cdots & & 0 \\ 0 & \ddots & \ddots & & & & & \vdots \\ \vdots & \ddots & \ddots & \ddots & & & & \vdots \\ \vdots & & 0 & \ddots & \ddots & & & \vdots \\ \vdots & & \vdots & -m_{k+1,k} & \ddots & \ddots & & \vdots \\ \vdots & & \vdots & \vdots & 0 & \ddots & & \vdots \\ \vdots & & \vdots & \vdots & \vdots & \ddots & \ddots & 0\\ 0 & \cdots & 0 & -m_{n,k} & 0 & \cdots & 0 & 1\\ \end{bmatrix} \end{aligned} \]
Gaussian elimination can be written as \[ \begin{aligned} A^{(n)} = M^{(n-1)}\cdots M^{(1)}A = \begin{bmatrix} a_{11}^{(1)} & a_{12}^{(1)} & \cdots & a_{1n}^{(1)} \\ 0 & a_{22}^{(2)} & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_{n-1,n}^{(n-1)} \\ 0 & \cdots & 0 & a_{nn}^{(n)} \end{bmatrix} \end{aligned} \]
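This factored form is easy to check numerically. The sketch below (our own construction, assuming nonzero pivots) builds each \(M^{(k)}\) from the current \(A^{(k)}\) and applies it:

```python
import numpy as np

def gauss_transform(Ak, k):
    """Build M^(k) for 0-based column k: the identity with -m_{j,k} = -a_jk/a_kk
    below the diagonal in column k. Assumes the pivot Ak[k, k] is nonzero."""
    n = Ak.shape[0]
    M = np.eye(n)
    M[k + 1:, k] = -Ak[k + 1:, k] / Ak[k, k]
    return M

A = np.array([[4.0, -1.0, 1.0], [2.0, 5.0, 2.0], [1.0, 2.0, 4.0]])
Ak = A.copy()
for k in range(A.shape[0] - 1):
    Ak = gauss_transform(Ak, k) @ Ak   # A^(k+1) = M^(k) A^(k)
print(Ak)   # upper triangular: this is A^(n)
```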
Reversing the elimination steps gives the inverses: \[ \begin{aligned} L^{(k)} = [M^{(k)}]^{-1} = \begin{bmatrix} 1 & 0 & & \cdots & & \cdots & & 0 \\ 0 & \ddots & \ddots & & & & & \vdots \\ \vdots & \ddots & \ddots & \ddots & & & & \vdots \\ \vdots & & 0 & \ddots & \ddots & & & \vdots \\ \vdots & & \vdots & m_{k+1,k} & \ddots & \ddots & & \vdots \\ \vdots & & \vdots & \vdots & 0 & \ddots & & \vdots \\ \vdots & & \vdots & \vdots & \vdots & \ddots & \ddots & 0\\ 0 & \cdots & 0 & m_{n,k} & 0 & \cdots & 0 & 1\\ \end{bmatrix} \end{aligned} \] and we have \[ \begin{aligned} LU&=L^{(1)}\cdots L^{(n-1)}\, M^{(n-1)}\cdots M^{(1)} A \\ &= [M^{(1)}]^{-1}\cdots [M^{(n-1)}]^{-1}\, M^{(n-1)} \cdots M^{(1)} A = A \end{aligned} \]
If Gaussian elimination can be performed on the linear system \(A\mathbf{x}=\mathbf{b}\) without row interchanges, \(A\) can be factored into the product of lower-triangular \(L\) and upper-triangular \(U\) as \(A=LU\), where \(m_{ji}=a_{ji}^{(i)}/a_{ii}^{(i)}\): \[ \begin{aligned} U= \begin{bmatrix} a_{11}^{(1)} & a_{12}^{(1)} & \cdots & a_{1n}^{(1)} \\ 0 & a_{22}^{(2)} & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_{n-1,n}^{(n-1)} \\ 0 & \cdots & 0 & a_{nn}^{(n)} \end{bmatrix},\ L= \begin{bmatrix} 1 & 0 & \cdots & 0 \\ m_{21} & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ m_{n1} & \cdots & m_{n,n-1} & 1 \end{bmatrix} \end{aligned} \]
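A minimal sketch of this factorization in code (no pivoting, so it assumes every pivot \(a_{ii}^{(i)}\) is nonzero; the name `lu_nopivot` is our own):

```python
import numpy as np

def lu_nopivot(A):
    """Factor A = L U by Gaussian elimination without row interchanges.
    L stores the multipliers m_ji = a_ji^(i) / a_ii^(i); U is what remains."""
    n = A.shape[0]
    U = np.array(A, dtype=float)
    L = np.eye(n)
    for i in range(n - 1):
        for j in range(i + 1, n):
            L[j, i] = U[j, i] / U[i, i]       # multiplier m_ji
            U[j, i:] -= L[j, i] * U[i, i:]    # E_j <- E_j - m_ji E_i
            U[j, i] = 0.0                     # clear roundoff residue below pivot
    return L, U

A = np.array([[4.0, -1.0, 1.0], [2.0, 5.0, 2.0], [1.0, 2.0, 4.0]])
L, U = lu_nopivot(A)
assert np.allclose(L @ U, A)
```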
Suppose \(k_1,\ldots,k_n\) is a permutation of \(1,\ldots,n\). The permutation matrix \(P=(p_{ij})\) is defined by \[ \begin{aligned} p_{ij} = \begin{cases} 1, & \text{if } j=k_i,\\ 0, & \text{otherwise.} \end{cases} \end{aligned} \]
Gaussian elimination with row interchanges then becomes: \[ \begin{aligned} A=P^{-1}LU = (P^t L ) U \end{aligned} \]
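The sketch below (our own helper, using partial pivoting: at each step the largest remaining entry in the pivot column is swapped into the pivot position) returns \(P\), \(L\), \(U\) with \(PA = LU\), so \(A = P^t L U\) as above:

```python
import numpy as np

def lu_partial_pivot(A):
    """LU factorization with partial pivoting: returns P, L, U with P A = L U."""
    n = A.shape[0]
    U = np.array(A, dtype=float)
    L = np.eye(n)
    perm = np.arange(n)                      # row i of P A is row perm[i] of A
    for k in range(n - 1):
        p = k + np.argmax(np.abs(U[k:, k]))  # largest pivot candidate in column k
        if p != k:                           # interchange rows k and p
            U[[k, p]] = U[[p, k]]
            L[[k, p], :k] = L[[p, k], :k]    # swap the multipliers already stored
            perm[[k, p]] = perm[[p, k]]
        for j in range(k + 1, n):
            L[j, k] = U[j, k] / U[k, k]
            U[j, k:] -= L[j, k] * U[k, k:]
            U[j, k] = 0.0
    P = np.eye(n)[perm]                      # p_ij = 1 exactly when j = perm[i]
    return P, L, U

A = np.array([[0.0, 2.0, 1.0], [1.0, 1.0, 0.0], [2.0, 1.0, 1.0]])
P, L, U = lu_partial_pivot(A)
assert np.allclose(P @ A, L @ U)
assert np.allclose(P.T @ L @ U, A)           # A = P^t L U
```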
The \(n\times n\) matrix \(A\) is said to be strictly diagonally dominant when \[ \begin{aligned} |a_{ii}|>\sum_{j\ne i} |a_{ij}| \end{aligned} \] for each \(i=1,\ldots,n\).
A strictly diagonally dominant matrix \(A\) is nonsingular, Gaussian elimination can be performed on \(A\mathbf{x}=\mathbf{b}\) without row interchanges, and the computations will be stable.
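The hypothesis is cheap to verify directly; a small sketch (helper name is our own):

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """Check |a_ii| > sum_{j != i} |a_ij| for every row i."""
    d = np.abs(np.diag(A))
    off_diagonal = np.abs(A).sum(axis=1) - d
    return bool(np.all(d > off_diagonal))

A = np.array([[7.0, 2.0, 0.0], [3.0, 5.0, -1.0], [0.0, 5.0, -6.0]])
print(is_strictly_diagonally_dominant(A))   # True: 7 > 2, 5 > 4, 6 > 5
```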
A matrix \(A\) is positive definite if it is symmetric and if \(\mathbf{x}^t A \mathbf{x} > 0\) for every \(\mathbf{x}\ne \mathbf{0}\).
If \(A\) is an \(n\times n\) positive definite matrix, then \(A\) is nonsingular; \(a_{ii}>0\) for each \(i=1,\ldots,n\); \(\max_{1\le k,j\le n}|a_{kj}|\le \max_{1\le i\le n}|a_{ii}|\); and \((a_{ij})^2 < a_{ii}a_{jj}\) for each \(i\ne j\).
A leading principal submatrix of a matrix \(A\) is a matrix of the form \[ \begin{aligned} A_k = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{kk} \end{bmatrix} \end{aligned} \] for some \(1\le k \le n\).
A symmetric matrix \(A\) is positive definite if and only if each of its leading principal submatrices has a positive determinant.
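For small matrices this criterion can be applied literally. The sketch below does so via determinants (fine for illustration, though determinants are a poor numerical test for large \(n\)):

```python
import numpy as np

def leading_minors_positive(A):
    """Test det(A_k) > 0 for every leading principal submatrix A_k."""
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
print(leading_minors_positive(A))   # True: the determinants are 2, 3, 4
```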
The symmetric matrix \(A\) is positive definite if and only if Gaussian elimination without row interchanges can be done on \(A\mathbf{x}=\mathbf{b}\) with all pivot elements positive, and the computations are then stable.
The matrix \(A\) is positive definite if and only if it can be factored \(A=LDL^t\) where \(L\) is lower triangular with \(1\)’s on its diagonal and \(D\) is diagonal with positive diagonal entries.
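A sketch of the standard \(LDL^t\) recurrence (no square roots; it assumes \(A\) is symmetric positive definite, so every \(d_i>0\) and no pivoting is needed):

```python
import numpy as np

def ldlt(A):
    """Factor symmetric positive definite A as L D L^t, with unit
    lower-triangular L and diagonal D."""
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for i in range(n):
        d[i] = A[i, i] - (L[i, :i] ** 2) @ d[:i]
        for k in range(i + 1, n):
            L[k, i] = (A[k, i] - (L[k, :i] * d[:i]) @ L[i, :i]) / d[i]
    return L, np.diag(d)

A = np.array([[4.0, -1.0, 1.0], [-1.0, 4.25, 2.75], [1.0, 2.75, 3.5]])
L, D = ldlt(A)
assert np.allclose(L @ D @ L.T, A)   # here D = diag(4, 4, 1)
```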
The matrix \(A\) is positive definite if and only if it can be factored \(A=LL^t\), where \(L\) is lower triangular with nonzero diagonal entries.
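The corresponding Cholesky sketch (again assuming positive definiteness, so each square root below is of a positive number):

```python
import numpy as np

def cholesky(A):
    """Factor symmetric positive definite A as L L^t."""
    n = A.shape[0]
    L = np.zeros((n, n))
    for i in range(n):
        L[i, i] = np.sqrt(A[i, i] - L[i, :i] @ L[i, :i])
        for k in range(i + 1, n):
            L[k, i] = (A[k, i] - L[k, :i] @ L[i, :i]) / L[i, i]
    return L

A = np.array([[4.0, -1.0, 1.0], [-1.0, 4.25, 2.75], [1.0, 2.75, 3.5]])
L = cholesky(A)
assert np.allclose(L @ L.T, A)   # agrees with np.linalg.cholesky(A)
```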
An \(n\times n\) matrix is called a band matrix if integers \(p,q\) exist with \(1<p,q<n\) such that \(a_{ij}=0\) whenever \(j-i\ge p\) or \(i-j\ge q\). The bandwidth is \(w=p+q-1\).
A tridiagonal matrix has \(p=q=2\) and bandwidth \(3\).
Suppose \(A=[a_{ij}]\) is tridiagonal with \(a_{i,i-1}a_{i,i+1}\ne 0\) for each \(2\le i\le n-1\). If \(|a_{11}|>|a_{12}|\), \(|a_{ii}|\ge |a_{i,i-1}|+|a_{i,i+1}|\) for \(2\le i\le n-1\), and \(|a_{nn}|>|a_{n,n-1}|\), then \(A\) is nonsingular.
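Tridiagonal systems can be solved in \(O(n)\) operations. Here is a sketch of the classic forward-elimination/back-substitution recurrence (often called the Thomas algorithm); the storage convention, three 1-D arrays with a[0] and c[-1] unused, is our own choice, and under the hypotheses of the theorem the pivots stay nonzero without row interchanges:

```python
import numpy as np

def solve_tridiagonal(a, b, c, d):
    """Solve a tridiagonal system: a = subdiagonal, b = diagonal,
    c = superdiagonal, d = right-hand side (a[0] and c[-1] are unused)."""
    n = len(b)
    b = np.array(b, dtype=float)
    d = np.array(d, dtype=float)
    for i in range(1, n):                # forward elimination, O(n) total
        m = a[i] / b[i - 1]
        b[i] -= m * c[i - 1]
        d[i] -= m * d[i - 1]
    x = np.zeros(n)                      # back substitution
    x[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (d[i] - c[i] * x[i + 1]) / b[i]
    return x

# Example: the 4x4 tridiagonal system with stencil (-1, 2, -1)
a = np.array([0.0, -1.0, -1.0, -1.0])
b = np.array([2.0, 2.0, 2.0, 2.0])
c = np.array([-1.0, -1.0, -1.0, 0.0])
d = np.array([1.0, 0.0, 0.0, 1.0])
print(solve_tridiagonal(a, b, c, d))     # [1. 1. 1. 1.]
```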