Chapter 6
Direct Methods for Solving Linear Systems
Per-Olof Persson
persson@berkeley.edu
Department of Mathematics
University of California, Berkeley
Math 128A Numerical Analysis
Consider solving a linear system of the form: \[ \begin{aligned} E_1: a_{11}x_1 + a_{12}x_2+\cdots+a_{1n}x_n &= b_1, \\ E_2: a_{21}x_1 + a_{22}x_2+\cdots+a_{2n}x_n &= b_2, \\ & \ \,\vdots \\ E_n: a_{n1}x_1 + a_{n2}x_2+\cdots+a_{nn}x_n &= b_n, \end{aligned} \] for \(x_1,\ldots,x_n\). Direct methods give an answer in a fixed number of steps, subject only to round-off errors.
We use three row operations to simplify the linear system:
1. Equation \(E_i\) can be multiplied by a nonzero constant \(\lambda\): \((\lambda E_i)\rightarrow(E_i)\).
2. A multiple of one equation can be added to another: \((E_i + \lambda E_j)\rightarrow(E_i)\).
3. Two equations can be interchanged: \((E_i)\leftrightarrow(E_j)\).
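As a concrete illustration, here is a minimal Python/NumPy sketch of Gaussian elimination with backward substitution (the name `gauss_solve` is ours, and the sketch assumes every pivot encountered is nonzero, so no row interchanges are needed):

```python
import numpy as np

def gauss_solve(A, b):
    """Solve Ax = b by Gaussian elimination plus backward substitution.
    Sketch only: assumes every pivot is nonzero (no row interchanges)."""
    A = np.array(A, dtype=float)   # work on copies so the inputs survive
    b = np.array(b, dtype=float)
    n = len(b)
    # Elimination: E_i <- E_i - m * E_k zeroes out column k below the pivot.
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    # Backward substitution on the resulting upper-triangular system.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = [[4.0, -1.0, 1.0], [2.0, 5.0, 2.0], [1.0, 2.0, 4.0]]
b = [8.0, 3.0, 11.0]
print(gauss_solve(A, b))   # agrees with np.linalg.solve(A, b)
```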
An \(n\times m\) matrix is a rectangular array of elements with \(n\) rows and \(m\) columns, in which both the value and the position of an element are important.
Operation counts for the elimination step (reducing the system to triangular form). Multiplications/divisions: \[ \begin{aligned} \sum_{i=1}^{n-1} (n-i)(n-i+2) = \cdots = \frac{2n^3 + 3n^2 -5n}{6} \end{aligned} \] Additions/subtractions: \[ \begin{aligned} \sum_{i=1}^{n-1} (n-i)(n-i+1) = \cdots = \frac{n^3 - n}{3} \end{aligned} \]
Operation counts for backward substitution. Multiplications/divisions: \[ \begin{aligned} 1+\sum_{i=1}^{n-1} ((n-i)+1) = \frac{n^2+n}{2} \end{aligned} \] Additions/subtractions: \[ \begin{aligned} \sum_{i=1}^{n-1} ((n-i-1)+1) = \frac{n^2-n}{2} \end{aligned} \]
Total operation counts for Gaussian elimination with backward substitution. Multiplications/divisions: \[ \begin{aligned} \frac{n^3}{3}+n^2-\frac{n}{3} \end{aligned} \] Additions/subtractions: \[ \begin{aligned} \frac{n^3}{3}+\frac{n^2}{2}-\frac{5n}{6} \end{aligned} \]
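Both totals grow like \(n^3/3\), so for large \(n\) the elimination step dominates the cost. The closed forms are easy to sanity-check numerically; the short sketch below (our own tallying code, not from the text) accumulates the per-step counts from the sums above and compares them with the totals:

```python
def op_counts(n):
    """Tally multiplications/divisions and additions/subtractions for
    Gaussian elimination (steps i = 1..n-1) plus backward substitution."""
    muls = adds = 0
    for i in range(1, n):                                # elimination step i
        muls += (n - i) * (n - i + 2)
        adds += (n - i) * (n - i + 1)
    muls += 1 + sum((n - i) + 1 for i in range(1, n))    # backward substitution
    adds += sum((n - i - 1) + 1 for i in range(1, n))
    return muls, adds

for n in (2, 10, 100):
    muls, adds = op_counts(n)
    assert muls == (n**3 + 3 * n**2 - n) // 3            # n^3/3 + n^2 - n/3
    assert adds == (2 * n**3 + 3 * n**2 - 5 * n) // 6    # n^3/3 + n^2/2 - 5n/6
```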
Two matrices \(A\) and \(B\) are equal if they have the same number of rows and columns \(n\times m\) and if \(a_{ij}=b_{ij}\) for each \(i=1,\ldots,n\) and \(j=1,\ldots,m\).
If \(A\) and \(B\) are \(n\times m\) matrices, the sum \(A+B\) is the \(n\times m\) matrix with entries \(a_{ij}+b_{ij}\).
If \(A\) is \(n\times m\) and \(\lambda\) a real number, the scalar multiplication \(\lambda A\) is the \(n \times m\) matrix with entries \(\lambda a_{ij}\).
Let \(A,B,C\) be \(n\times m\) matrices and \(\lambda,\mu\) real numbers. Then \(A+B=B+A\); \((A+B)+C=A+(B+C)\); \(A+0=0+A=A\) and \(A+(-A)=0\), where \(0\) denotes the zero matrix and \(-A=(-1)A\); \(\lambda(A+B)=\lambda A+\lambda B\); \((\lambda+\mu)A=\lambda A+\mu A\); \(\lambda(\mu A)=(\lambda\mu)A\); and \(1A=A\).
Let \(A\) be \(n\times m\) and \(B\) be \(m\times p\). The matrix product \(C=AB\) is the \(n\times p\) matrix with entries \[ \begin{aligned} c_{ij} = \sum_{k=1}^m a_{ik}b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j}+\cdots + a_{im}b_{mj} \end{aligned} \]
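The definition translates directly into a triple loop. The sketch below (illustrative only, far slower than optimized libraries) computes each \(c_{ij}\) from the sum above:

```python
import numpy as np

def matmul(A, B):
    """Matrix product straight from the definition c_ij = sum_k a_ik * b_kj."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must agree"
    C = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
assert np.allclose(matmul(A, B), A @ B)
```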
Let \(A\) be \(n\times m\), \(B\) be \(m\times k\), \(C\) be \(k\times p\), \(D\) be \(m\times k\), and \(\lambda\) a real number. Then \(A(BC)=(AB)C\); \(A(B+D)=AB+AD\); \(I_m B=B\) and \(BI_k=B\), where \(I_m\) denotes the \(m\times m\) identity matrix; and \(\lambda(AB)=(\lambda A)B=A(\lambda B)\).
For any nonsingular \(n\times n\) matrix \(A\): the inverse \(A^{-1}\) is unique; \(A^{-1}\) is nonsingular with \((A^{-1})^{-1}=A\); and if \(B\) is also a nonsingular \(n\times n\) matrix, then \((AB)^{-1}=B^{-1}A^{-1}\).
The following statements are equivalent for any \(n\times n\) matrix \(A\): \(A\) is nonsingular (i.e., \(A^{-1}\) exists); \(\det A\ne 0\); the homogeneous system \(A\mathbf{x}=\mathbf{0}\) has only the trivial solution \(\mathbf{x}=\mathbf{0}\); the system \(A\mathbf{x}=\mathbf{b}\) has a unique solution for every \(\mathbf{b}\); and Gaussian elimination with row interchanges can be performed on \(A\mathbf{x}=\mathbf{b}\) for any \(\mathbf{b}\).
The \(k\)th Gaussian transformation matrix is defined by \[ \begin{aligned} M^{(k)} = \begin{bmatrix} 1 & 0 & & \cdots & & \cdots & & 0 \\ 0 & \ddots & \ddots & & & & & \vdots \\ \vdots & \ddots & \ddots & \ddots & & & & \vdots \\ \vdots & & 0 & \ddots & \ddots & & & \vdots \\ \vdots & & \vdots & -m_{k+1,k} & \ddots & \ddots & & \vdots \\ \vdots & & \vdots & \vdots & 0 & \ddots & & \vdots \\ \vdots & & \vdots & \vdots & \vdots & \ddots & \ddots & 0\\ 0 & \cdots & 0 & -m_{n,k} & 0 & \cdots & 0 & 1\\ \end{bmatrix} \end{aligned} \]
Gaussian elimination can be written as \[ \begin{aligned} A^{(n)} = M^{(n-1)}\cdots M^{(1)}A = \begin{bmatrix} a_{11}^{(1)} & a_{12}^{(1)} & \cdots & a_{1n}^{(1)} \\ 0 & a_{22}^{(2)} & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_{n-1,n}^{(n-1)} \\ 0 & \cdots & 0 & a_{nn}^{(n)} \end{bmatrix} \end{aligned} \]
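This factored form is easy to check numerically. The sketch below (our own construction, assuming nonzero pivots) builds each \(M^{(k)}\) from the current \(A^{(k)}\) and applies it:

```python
import numpy as np

def gauss_transform(Ak, k):
    """Build M^(k) for 0-based column k: the identity with -m_{j,k} = -a_jk/a_kk
    below the diagonal in column k. Assumes the pivot Ak[k, k] is nonzero."""
    n = Ak.shape[0]
    M = np.eye(n)
    M[k + 1:, k] = -Ak[k + 1:, k] / Ak[k, k]
    return M

A = np.array([[4.0, -1.0, 1.0], [2.0, 5.0, 2.0], [1.0, 2.0, 4.0]])
Ak = A.copy()
for k in range(A.shape[0] - 1):
    Ak = gauss_transform(Ak, k) @ Ak   # A^(k+1) = M^(k) A^(k)
print(Ak)   # upper triangular: this is A^(n)
```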
Reversing the elimination steps gives the inverses: \[ \begin{aligned} L^{(k)} = [M^{(k)}]^{-1} = \begin{bmatrix} 1 & 0 & & \cdots & & \cdots & & 0 \\ 0 & \ddots & \ddots & & & & & \vdots \\ \vdots & \ddots & \ddots & \ddots & & & & \vdots \\ \vdots & & 0 & \ddots & \ddots & & & \vdots \\ \vdots & & \vdots & m_{k+1,k} & \ddots & \ddots & & \vdots \\ \vdots & & \vdots & \vdots & 0 & \ddots & & \vdots \\ \vdots & & \vdots & \vdots & \vdots & \ddots & \ddots & 0\\ 0 & \cdots & 0 & m_{n,k} & 0 & \cdots & 0 & 1\\ \end{bmatrix} \end{aligned} \] and we have \[ \begin{aligned} LU&=L^{(1)}\cdots L^{(n-1)}\, M^{(n-1)}\cdots M^{(1)} A \\ &= [M^{(1)}]^{-1}\cdots [M^{(n-1)}]^{-1}\, M^{(n-1)} \cdots M^{(1)} A = A \end{aligned} \]
If Gaussian elimination can be performed on the linear system \(A\mathbf{x}=\mathbf{b}\) without row interchanges, \(A\) can be factored into the product of lower-triangular \(L\) and upper-triangular \(U\) as \(A=LU\), where \(m_{ji}=a_{ji}^{(i)}/a_{ii}^{(i)}\): \[ \begin{aligned} U= \begin{bmatrix} a_{11}^{(1)} & a_{12}^{(1)} & \cdots & a_{1n}^{(1)} \\ 0 & a_{22}^{(2)} & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_{n-1,n}^{(n-1)} \\ 0 & \cdots & 0 & a_{nn}^{(n)} \end{bmatrix},\ L= \begin{bmatrix} 1 & 0 & \cdots & 0 \\ m_{21} & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ m_{n1} & \cdots & m_{n,n-1} & 1 \end{bmatrix} \end{aligned} \]
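A minimal sketch of this factorization in code (no pivoting, so it assumes every pivot \(a_{ii}^{(i)}\) is nonzero; the name `lu_nopivot` is our own):

```python
import numpy as np

def lu_nopivot(A):
    """Factor A = L U by Gaussian elimination without row interchanges.
    L stores the multipliers m_ji = a_ji^(i) / a_ii^(i); U is what remains."""
    n = A.shape[0]
    U = np.array(A, dtype=float)
    L = np.eye(n)
    for i in range(n - 1):
        for j in range(i + 1, n):
            L[j, i] = U[j, i] / U[i, i]       # multiplier m_ji
            U[j, i:] -= L[j, i] * U[i, i:]    # E_j <- E_j - m_ji E_i
            U[j, i] = 0.0                     # clear roundoff residue below pivot
    return L, U

A = np.array([[4.0, -1.0, 1.0], [2.0, 5.0, 2.0], [1.0, 2.0, 4.0]])
L, U = lu_nopivot(A)
assert np.allclose(L @ U, A)
```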
Suppose \(k_1,\ldots,k_n\) is a permutation of \(1,\ldots,n\). The permutation matrix \(P=(p_{ij})\) is defined by \[ \begin{aligned} p_{ij} = \begin{cases} 1, & \text{if } j=k_i,\\ 0, & \text{otherwise.} \end{cases} \end{aligned} \]
Gaussian elimination with row interchanges then becomes: \[ \begin{aligned} A=P^{-1}LU = (P^t L ) U \end{aligned} \]
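The sketch below (our own helper, using partial pivoting: at each step the largest remaining entry in the pivot column is swapped into the pivot position) returns \(P\), \(L\), \(U\) with \(PA = LU\), so \(A = P^t L U\) as above:

```python
import numpy as np

def lu_partial_pivot(A):
    """LU factorization with partial pivoting: returns P, L, U with P A = L U."""
    n = A.shape[0]
    U = np.array(A, dtype=float)
    L = np.eye(n)
    perm = np.arange(n)                      # row i of P A is row perm[i] of A
    for k in range(n - 1):
        p = k + np.argmax(np.abs(U[k:, k]))  # largest pivot candidate in column k
        if p != k:                           # interchange rows k and p
            U[[k, p]] = U[[p, k]]
            L[[k, p], :k] = L[[p, k], :k]    # swap the multipliers already stored
            perm[[k, p]] = perm[[p, k]]
        for j in range(k + 1, n):
            L[j, k] = U[j, k] / U[k, k]
            U[j, k:] -= L[j, k] * U[k, k:]
            U[j, k] = 0.0
    P = np.eye(n)[perm]                      # p_ij = 1 exactly when j = perm[i]
    return P, L, U

A = np.array([[0.0, 2.0, 1.0], [1.0, 1.0, 0.0], [2.0, 1.0, 1.0]])
P, L, U = lu_partial_pivot(A)
assert np.allclose(P @ A, L @ U)
assert np.allclose(P.T @ L @ U, A)           # A = P^t L U
```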
The \(n\times n\) matrix \(A\) is said to be strictly diagonally dominant when \[ \begin{aligned} |a_{ii}|>\sum_{j\ne i} |a_{ij}| \end{aligned} \] for each \(i=1,\ldots,n\).
A strictly diagonally dominant matrix \(A\) is nonsingular, Gaussian elimination can be performed on \(A\mathbf{x}=\mathbf{b}\) without row interchanges, and the computations will be stable.
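The hypothesis is cheap to verify directly; a small sketch (helper name is our own):

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """Check |a_ii| > sum_{j != i} |a_ij| for every row i."""
    d = np.abs(np.diag(A))
    off_diagonal = np.abs(A).sum(axis=1) - d
    return bool(np.all(d > off_diagonal))

A = np.array([[7.0, 2.0, 0.0], [3.0, 5.0, -1.0], [0.0, 5.0, -6.0]])
print(is_strictly_diagonally_dominant(A))   # True: 7 > 2, 5 > 4, 6 > 5
```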
A matrix \(A\) is positive definite if it is symmetric and if \(\mathbf{x}^t A \mathbf{x} > 0\) for every \(\mathbf{x}\ne \mathbf{0}\).
If \(A\) is an \(n\times n\) positive definite matrix, then \(A\) is nonsingular; \(a_{ii}>0\) for each \(i=1,\ldots,n\); \(\max_{1\le k,j\le n}|a_{kj}|\le \max_{1\le i\le n}|a_{ii}|\); and \((a_{ij})^2 < a_{ii}a_{jj}\) for each \(i\ne j\).
A leading principal submatrix of a matrix \(A\) is a matrix of the form \[ \begin{aligned} A_k = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{kk} \end{bmatrix} \end{aligned} \] for some \(1\le k \le n\).
A symmetric matrix \(A\) is positive definite if and only if each of its leading principal submatrices has a positive determinant.
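For small matrices this criterion can be applied literally. The sketch below does so via determinants (fine for illustration, though determinants are a poor numerical test for large \(n\)):

```python
import numpy as np

def leading_minors_positive(A):
    """Test det(A_k) > 0 for every leading principal submatrix A_k."""
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
print(leading_minors_positive(A))   # True: the determinants are 2, 3, 4
```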
The symmetric matrix \(A\) is positive definite if and only if Gaussian elimination without row interchanges can be done on \(A\mathbf{x}=\mathbf{b}\) with all pivot elements positive, and the computations are then stable.
The matrix \(A\) is positive definite if and only if it can be factored \(A=LDL^t\) where \(L\) is lower triangular with \(1\)’s on its diagonal and \(D\) is diagonal with positive diagonal entries.
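A sketch of the standard \(LDL^t\) recurrence (no square roots; it assumes \(A\) is symmetric positive definite, so every \(d_i>0\) and no pivoting is needed):

```python
import numpy as np

def ldlt(A):
    """Factor symmetric positive definite A as L D L^t, with unit
    lower-triangular L and diagonal D."""
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for i in range(n):
        d[i] = A[i, i] - (L[i, :i] ** 2) @ d[:i]
        for k in range(i + 1, n):
            L[k, i] = (A[k, i] - (L[k, :i] * d[:i]) @ L[i, :i]) / d[i]
    return L, np.diag(d)

A = np.array([[4.0, -1.0, 1.0], [-1.0, 4.25, 2.75], [1.0, 2.75, 3.5]])
L, D = ldlt(A)
assert np.allclose(L @ D @ L.T, A)   # here D = diag(4, 4, 1)
```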
The matrix \(A\) is positive definite if and only if it can be factored \(A=LL^t\), where \(L\) is lower triangular with nonzero diagonal entries.
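The corresponding Cholesky sketch (again assuming positive definiteness, so each square root below is of a positive number):

```python
import numpy as np

def cholesky(A):
    """Factor symmetric positive definite A as L L^t."""
    n = A.shape[0]
    L = np.zeros((n, n))
    for i in range(n):
        L[i, i] = np.sqrt(A[i, i] - L[i, :i] @ L[i, :i])
        for k in range(i + 1, n):
            L[k, i] = (A[k, i] - L[k, :i] @ L[i, :i]) / L[i, i]
    return L

A = np.array([[4.0, -1.0, 1.0], [-1.0, 4.25, 2.75], [1.0, 2.75, 3.5]])
L = cholesky(A)
assert np.allclose(L @ L.T, A)   # agrees with np.linalg.cholesky(A)
```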
An \(n\times n\) matrix is called a band matrix if integers \(p,q\) exist with \(1<p,q<n\) such that \(a_{ij}=0\) whenever \(j-i\ge p\) or \(i-j\ge q\). The bandwidth is \(w=p+q-1\).
A tridiagonal matrix has \(p=q=2\) and bandwidth \(3\).
Suppose \(A=[a_{ij}]\) is tridiagonal with \(a_{i,i-1}a_{i,i+1}\ne 0\) for each \(2\le i\le n-1\). If \(|a_{11}|>|a_{12}|\), \(|a_{ii}|\ge |a_{i,i-1}|+|a_{i,i+1}|\) for \(2\le i\le n-1\), and \(|a_{nn}|>|a_{n,n-1}|\), then \(A\) is nonsingular.
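Tridiagonal systems can be solved in \(O(n)\) operations. Here is a sketch of the classic forward-elimination/back-substitution recurrence (often called the Thomas algorithm); the storage convention, three 1-D arrays with a[0] and c[-1] unused, is our own choice, and under the hypotheses of the theorem the pivots stay nonzero without row interchanges:

```python
import numpy as np

def solve_tridiagonal(a, b, c, d):
    """Solve a tridiagonal system: a = subdiagonal, b = diagonal,
    c = superdiagonal, d = right-hand side (a[0] and c[-1] are unused)."""
    n = len(b)
    b = np.array(b, dtype=float)
    d = np.array(d, dtype=float)
    for i in range(1, n):                # forward elimination, O(n) total
        m = a[i] / b[i - 1]
        b[i] -= m * c[i - 1]
        d[i] -= m * d[i - 1]
    x = np.zeros(n)                      # back substitution
    x[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (d[i] - c[i] * x[i + 1]) / b[i]
    return x

# Example: the 4x4 tridiagonal system with stencil (-1, 2, -1)
a = np.array([0.0, -1.0, -1.0, -1.0])
b = np.array([2.0, 2.0, 2.0, 2.0])
c = np.array([-1.0, -1.0, -1.0, 0.0])
d = np.array([1.0, 0.0, 0.0, 1.0])
print(solve_tridiagonal(a, b, c, d))     # [1. 1. 1. 1.]
```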