17.2. Sparse Matrices in Julia#
Julia supports sparse matrices in the SparseMatrixCSC
type. It uses
the CSC format, and the datatype Tv
for the nonzeros and all indices Ti
can optionally be specified, SparseMatrixCSC{Tv,Ti}
.
Some special sparse matrices can be created using the following functions (together with their dense equivalents):
Sparse 
Dense 
Description 



mbyn matrix of zeros 


nbyn identity matrix 


Interconverts between dense and sparse formats 


mbyn random matrix (uniform) of density d 


mbyn random matrix (normal) of density d 
More general sparse matrices can be created with the syntax A = sparse(rows,cols,vals)
which
takes a vector rows
of row indices, a vector cols
of column indices,
and a vector vals
of stored values (essentially the COO format).
The inverse of this syntax is rows,cols,vals = findnz(A)
.
The number of nonzeros of a matrix A
are returned by the nnz(A)
function.
17.2.1. Example#
For the matrix considered above, the easiest approach is to start from the COO format
and use sparse(rows, cols, vals)
. The size of the matrix is determined from the
indices, if needed it can also be specified as sparse(rows, cols, vals, m, n)
.
using PyPlot, SparseArrays, LinearAlgebra # Packages used
rows = [1,3,4,2,1,3,1,4,1,5]
cols = [1,1,1,2,3,3,4,4,5,5]
vals = [5,2,4,5,3,1,2,10,7,9]
A = sparse(rows, cols, vals, 5, 5)
5×5 SparseMatrixCSC{Int64, Int64} with 10 stored entries:
5 ⋅ 3 2 7
⋅ 5 ⋅ ⋅ ⋅
2 ⋅ 1 ⋅ ⋅
4 ⋅ ⋅ 10 ⋅
⋅ ⋅ ⋅ ⋅ 9
We note that Julia only displays the nonzeros in the matrix. If needed, it can be converted to a dense matrix:
Array(A)
5×5 Matrix{Int64}:
5 0 3 2 7
0 5 0 0 0
2 0 1 0 0
4 0 0 10 0
0 0 0 0 9
But in many cases, it is enough to only show the sparsity pattern of the matrix (not the actual values). PyPlot can visualize this using a socalled spy plot:
spy(A, marker=".", markersize=24); ## Note  0based row and columns
17.2.2. Operations on sparse matrices#
Many operations work exactly the same for sparse matrices as for dense matrices, including arithmetic operations, indexing, assignment, and concatenation:
B = A  4.3A # Will automatically convert datatype of values to Float64
B[:,4] .= 1.1 # OK since B now has Float64 values (otherwise use Float64.(A) to convert)
C = A * A' # Matrix multiplication (note: typically increases nnz)
Matrix([B C]) # Concatenation, again automatic conversion (of C)
5×10 Matrix{Float64}:
16.5 0.0 9.9 1.1 23.1 87.0 0.0 7.0 0.0 63.0
0.0 16.5 0.0 1.1 0.0 0.0 25.0 0.0 0.0 0.0
6.6 0.0 3.3 1.1 0.0 7.0 0.0 5.0 8.0 0.0
13.2 0.0 0.0 1.1 0.0 0.0 0.0 8.0 116.0 0.0
0.0 0.0 0.0 1.1 29.7 63.0 0.0 0.0 0.0 81.0
However, note that some standard operations can make the matrix more dense, and it might
not make sense to use a sparse storage format for the result. Also, inserting new elements
is expensive (for example the operation on the 4th column of B
in the example above).
17.2.3. Incremental matrix construction#
Since Julia uses the CSC format for sparse matrices, it is inefficient to create matrices incrementally (that is, to insert new nonzeros into the matrix). As an example, consider building a matrix using a forloop. We start with an empty sparse matrix of given size \(N\)by\(N\), and insert a total of \(10N\) new random entries at random positions.
"""
Incremental matrix construction using the sparseformat
Not recommended: Insertion into existing matrix very slow
"""
function incremental_test_1(N)
A = spzeros(N,N)
for k = 1:10N
i,j = rand(1:N, 2)
A[i,j] = rand()
end
return A
end
incremental_test_1
We time the function for increasing values of \(N\):
incremental_test_1(10); # Force compile before timing
for N in [100,1000,10000]
@time incremental_test_1(N);
end
0.000124 seconds (1.01 k allocations: 122.719 KiB)
0.005774 seconds (10.02 k allocations: 1.409 MiB)
0.771326 seconds (100.02 k allocations: 11.361 MiB)
We can observe the approximately quadratic dependency on \(N\), even though the number of nonzeros is only proportional to \(N\). This is because of the inefficiencies with element insertion into a sparse matrix.
Instead, we can build the same matrix using the COO format (row, column, and value indices)
and only call sparse
ones:
"""
Incremental matrix construction using COO and a single call to sparse
Fast approach, avoids incremental insertion into existing array
"""
function incremental_test_2(N)
rows = Int64[]
cols = Int64[]
vals = Float64[]
for i = 1:10N
push!(rows, rand(1:N))
push!(cols, rand(1:N))
push!(vals, rand())
end
return sparse(rows, cols, vals, N, N)
end
incremental_test_2
incremental_test_2(10); # Force compile before timing
for N in [100,1000,10000,100000,1000000]
@time incremental_test_2(N);
end
0.000042 seconds (28 allocations: 99.500 KiB)
0.000494 seconds (41 allocations: 1.285 MiB)
0.017001 seconds (50 allocations: 8.764 MiB, 55.91% gc time)
0.115986 seconds (59 allocations: 62.151 MiB, 4.34% gc time)
2.404683 seconds (71 allocations: 476.117 MiB, 0.28% gc time)
This version is magnitudes faster than the previous one, although it does not quite achieve
linear dependency on \(N\) (possibly because of the sorting inside sparse
).